Overview
Learn the fundamentals of distributed training in machine learning through a recorded MIT lecture covering parallelization methods and memory optimization techniques. Topics include communication primitives, data parallelism, the ZeRO and FSDP memory-reduction strategies, pipeline parallelism, tensor parallelism, and sequence parallelism. Taught by Professor Song Han, the 70-minute lecture covers the background and motivation for distributed training along with the main parallelization approaches used in modern distributed ML training systems.
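As a quick orientation to the data-parallelism and communication-primitive topics listed above, the following is a minimal sketch (not taken from the lecture materials) of synchronizing gradients across workers with an all-reduce, assuming PyTorch's torch.distributed package and a launch via torchrun; the file name and synthetic data are illustrative only.

```python
# Minimal data-parallel sketch: each rank computes gradients on its own data
# shard, then an all-reduce averages gradients so every replica applies the
# same update. Launch with, e.g.:
#   torchrun --nproc_per_node=2 data_parallel_sketch.py
import torch
import torch.distributed as dist
import torch.nn as nn

def main():
    dist.init_process_group(backend="gloo")   # use "nccl" on GPU nodes
    rank = dist.get_rank()
    world_size = dist.get_world_size()

    torch.manual_seed(0)                      # identical initial weights on every rank
    model = nn.Linear(16, 4)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # Each rank works on its own (synthetic) shard of the batch.
    x = torch.randn(8, 16) + rank
    y = torch.randn(8, 4)

    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()

    # Communication primitive: all-reduce (sum), then divide to average.
    for p in model.parameters():
        dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
        p.grad /= world_size

    optimizer.step()
    if rank == 0:
        print(f"world_size={world_size}, local loss={loss.item():.4f}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```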
Syllabus
EfficientML.ai Lecture 19 - Distributed Training Part 1 (Zoom Recording) (MIT 6.5940, Fall 2024)
Taught by
MIT HAN Lab