CUDA Crash Course

via YouTube

Overview

Dive into a 7-hour crash course on CUDA programming that runs from basic vector addition to GPU performance optimizations. Learn to implement and optimize matrix multiplication, sum reduction, convolution, and histogram kernels in CUDA. Explore unified memory, cache tiling, memory coalescing, constant memory, and libraries such as cuBLAS. Set up and work in both Windows (Visual Studio 2017) and Linux environments, and pick up practical skills such as thinking spatially about thread layouts and handling non-perfect input sizes. Later lessons cover querying device properties, profiling kernels with clock(), and comparing implementations of the same algorithm to see how each optimization affects GPU performance.

Syllabus

CUDA Crash Course: Vector Addition.
CUDA Crash Course: Unified Memory Vector Add.
CUDA Crash Course: Matrix Multiplication.
CUDA Crash Course: Cache Tiled Matrix Multiplication.
CUDA Crash Course: Why Coalescing Matters.
CUDA Crash Course: cuBLAS Vector Add.
CUDA Crash Course: cuBLAS Matrix Multiplication.
CUDA Crash Course: Sum Reduction Part 1.
CUDA Crash Course: Sum Reduction Part 2.
CUDA Crash Course: Sum Reduction Part 3.
CUDA Crash Course: Sum Reduction Part 4.
CUDA Crash Course: Sum Reduction Part 5.
CUDA Crash Course: Visual Studio 2017 Environment Setup.
CUDA Crash Course: Programming in Linux.
CUDA Crash Course: Video Corrections.
CUDA Crash Course: Sum Reduction Part 6.
CUDA Crash Course: Naive 1-D Convolution.
CUDA Crash Course: 1-D Convolution with Constant Memory.
CUDA Crash Course: Tiled 1-D Convolution.
CUDA Crash Course: 1-D Convolution Cache Simplification.
CUDA Crash Course: 2-D Convolution.
CUDA Crash Course: Thinking Spatially.
CUDA Crash Course: Optimizing Histogram Kernels.
CUDA Crash Course: Comparing Matrix Multiplication Implementations.
CUDA Crash Course: Comparing Sum Reduction Implementations.
CUDA Crash Course: Handling Non-Perfect Input Sizes.
CUDA Crash Course: OpenACC Matrix Multiplication.
CUDA Crash Course: Device Properties.
CUDA Crash Course: Profiling with clock().
CUDA Crash Course: GPU Performance Optimizations Part 1.

Taught by

CoffeeBeforeArch

Reviews

5.0 rating, based on 1 Class Central review

  • Nick's course is a very intensive and complete introduction to CUDA C++ programming. I enjoyed both the video lessons and the code available on GitHub, and I enjoyed his very clear way of teaching CUDA and going into the details of GPU architecture. I absolutely recommend this course!
