Improvements to NVIDIA CUDA and Deep Learning Libraries - Session 1

NVIDIA Developer via YouTube

Classroom Contents

  1. Intro
  2. CUDA DEVELOPMENT ECOSYSTEM
  3. POWERING THE DEEP LEARNING ECOSYSTEM
  4. TESLA UNIVERSAL ACCELERATION PLATFORM
  5. ACCELERATED COMPUTING IS FULL-STACK OPTIMIZATION
  6. INTRODUCING CUDA 10.0
  7. 16 GPUS WITH 32GB MEMORY EACH
  8. NVSWITCH: ALL-TO-ALL CONNECTIVITY
  9. UNIFIED MEMORY + DGX-2
  10. 2X HIGHER PERFORMANCE WITH NVSWITCH
  11. NEW PROGRAMMING MODEL FEATURES
  12. ASYNCHRONOUS TASK GRAPHS
  13. NEW EXECUTION MECHANISM
  14. EXECUTION OPTIMIZATIONS
  15. PERFORMANCE IMPACT
  16. THE PATH TO FUSION ENERGY
  17. VOLTA TENSOR CORE
  18. NEW TURING TENSOR CORE
  19. NEW TURING WARP MATRIX FUNCTIONS
  20. CUTLASS 1.1
  21. NVIDIA NGX: DL FOR CREATIVE APPLICATIONS
  22. IN ADOBE PHOTOSHOP
  23. CUDNN: GPU ACCELERATED DEEP LEARNING
  24. IMPROVED HEURISTICS FOR CONVOLUTIONS
  25. PERSISTENT RNN SPEEDUP ON V100
  26. STRIDED ACTIVATION GRADIENTS
  27. TENSORCORES WITH FP32 MODELS
  28. MORE TENSORCORE PERFORMANCE IMPROVEMENTS
  29. GENERAL PERFORMANCE IMPROVEMENTS
  30. FUTURE UPDATES
