Overview
Learn CUDA programming for high-performance computing and deep learning in this 11-hour 55-minute course. It begins with an introduction to the deep learning ecosystem, then covers CUDA setup and a C/C++ review. From there you explore GPU architecture, write your first CUDA kernels, work with the CUDA API, and optimize matrix multiplication. The course also introduces Triton, a language for writing fast GPU code, and shows how to create PyTorch extensions. As a final project, you implement a multi-layer perceptron for MNIST. Accompanying code is available on GitHub, and the instructor can be reached on several platforms.
Syllabus
⌨️ Intro
⌨️ Chapter 1 Deep Learning Ecosystem
⌨️ Chapter 2 CUDA Setup
⌨️ Chapter 3 C/C++ Review
⌨️ Chapter 4 Intro to GPUs
⌨️ Chapter 5 Writing your First Kernels
⌨️ Chapter 6 CUDA API
⌨️ Chapter 7 Faster Matrix Multiplication
⌨️ Chapter 8 Triton
⌨️ Chapter 9 PyTorch Extensions
⌨️ Chapter 10 MNIST Multi-layer Perceptron
⌨️ Chapter 11 Next steps?
⌨️ Outro
Taught by
freeCodeCamp.org