Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
ZeRO-Offload - Democratizing Billion-Scale Model Training
- 1 Intro
- 2 The Size of Deep Learning Models Is Increasing Quickly
- 3 Billion-Scale Model Training - Scale Out Large
- 4 Mixed-precision training
- 5 Limiting CPU Computation
- 6 Minimizing Communication Volume
- 7 ZeRO-Offload enables large model training, offloading data and compute to CPU
- 8 Unique Optimal Offload Strategy
- 9 ZeRO-Offload Single GPU Schedule
- 10 ZeRO-Offload Multi-GPU Schedule
- 11 Optimized CPU Execution
- 12 Evaluation
- 13 Model Scale
- 14 Training Throughput - Single GPU
- 15 Training Throughput - Multiple GPUs
- 16 Throughput Scalability
- 17 One-step Delayed Parameter Update (DPU)
- 18 Conclusions