Getting Started with OpenACC - Part II

Nvidia via YouTube

Overview

The second part of an OpenACC tutorial presented by Jeff Larkin of NVIDIA. Using a Jacobi iteration as the running example, it covers reading compiler output, offloading parallel kernels, and cutting excessive data transfers with data directives. It then shows how to combine OpenACC with MPI for distributed computing, using the update and host_data directives, and closes with tips and tricks for GPU-accelerated code, including how the C restrict keyword helps the compiler optimize.
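
As a rough illustration of the data-directive idea mentioned above (a sketch based on general OpenACC usage, not code taken from the video), the function below wraps a Jacobi loop in a data region so the grid stays on the GPU between sweeps. The names N, jacobi, tol, and max_iter are hypothetical and chosen only for this example.

```c
#include <assert.h>

#define N 64  /* hypothetical grid size, for illustration only */

/* Runs Jacobi sweeps on an N x N grid until the largest change in a
   sweep drops to tol or below, or max_iter sweeps have run.
   Returns the number of sweeps performed. */
int jacobi(float A[N][N], float Anew[N][N], float tol, int max_iter)
{
    assert(tol > 0.0f && max_iter > 0);
    float err = tol + 1.0f;
    int iter = 0;

    /* Data region: A is copied to the device on entry and back on exit;
       Anew is only allocated on the device and never transferred. */
    #pragma acc data copy(A[0:N][0:N]) create(Anew[0:N][0:N])
    while (err > tol && iter < max_iter) {
        err = 0.0f;

        /* Each interior point becomes the average of its four neighbours. */
        #pragma acc parallel loop reduction(max:err)
        for (int j = 1; j < N - 1; j++) {
            #pragma acc loop reduction(max:err)
            for (int i = 1; i < N - 1; i++) {
                Anew[j][i] = 0.25f * (A[j][i + 1] + A[j][i - 1] +
                                      A[j - 1][i] + A[j + 1][i]);
                float diff = Anew[j][i] - A[j][i];
                if (diff < 0.0f) diff = -diff;
                if (diff > err) err = diff;
            }
        }

        /* Copy the new interior values back for the next sweep. */
        #pragma acc parallel loop
        for (int j = 1; j < N - 1; j++) {
            #pragma acc loop
            for (int i = 1; i < N - 1; i++) {
                A[j][i] = Anew[j][i];
            }
        }
        iter++;
    }
    return iter;
}
```

When built without OpenACC support, a compiler ignores the pragmas and the function runs serially, which makes it easy to check results on the host before moving to a GPU.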

Syllabus

Intro
Example: Jacobi Iteration
Jacobi Iteration: C Code
Jacobi Iteration: OpenACC C Code
PGI Accelerator Compiler output (C)
What went wrong? (Set the PGI_ACC_TIME environment variable to 1)
Offloading a Parallel Kernel
Separating Data from Computation
Excessive Data Transfers
Defining data regions
Data Clauses: copy(list) allocates memory on the GPU, copies data from host to GPU when entering the region, and copies data back to the host when exiting the region
Array Shaping
Jacobi Iteration: Data Directives
Execution Time (lower is better)
Further speedups
Calling MPI with OpenACC (Standard MPI)
OpenACC update Directive
OpenACC host_data Directive
Calling MPI with OpenACC (GPU-aware MPI)
C tip: the restrict keyword
Tips and Tricks (cont.)
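
To make the array shaping and restrict items in the syllabus a little more concrete, here is a minimal sketch, assuming a simple vector update rather than anything shown in the video. The function name saxpy and its parameters are hypothetical. The restrict qualifiers promise that x and y never overlap, which is the kind of guarantee a compiler generally needs before it will parallelize a loop inside a kernels region.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y = a * x + y for n elements.
   restrict tells the compiler that x and y do not alias, so every
   loop iteration is independent of the others. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

    /* Array shaping: x[0:n] means "n elements starting at index 0".
       x is only read (copyin); y is read and written (copy). */
    #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling saxpy with overlapping x and y would break the restrict promise and is undefined behavior, so the qualifier should only be added when the caller can guarantee separate arrays.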

Taught by

NVIDIA Developer
