Overview
Explore the communication bottlenecks of distributed training in this 58-minute lecture from the MIT HAN Lab. Dive into bandwidth and latency challenges, and learn how gradient compression techniques such as gradient pruning and quantization reduce the volume of data exchanged between workers, easing bandwidth limitations. Discover how delayed gradient averaging mitigates latency by letting workers continue local computation while gradients are in flight. Gain insights into efficient machine learning techniques for deploying neural networks on resource-constrained devices such as mobile phones and IoT devices. Access the accompanying slides and explore topics including model compression, pruning, quantization, neural architecture search, and distillation as part of the broader TinyML and Efficient Deep Learning Computing course.
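To make the two bandwidth-saving ideas concrete, here is a minimal, illustrative sketch of top-k gradient pruning followed by 8-bit quantization. This is not the lecture's actual implementation; the function names (`topk_sparsify`, `quantize_int8`, `dequantize`) and the choice of k=10 and symmetric int8 scaling are assumptions for illustration only.

```python
import random

def topk_sparsify(grad, k):
    # Gradient pruning: keep only the k largest-magnitude entries;
    # only their (index, value) pairs are sent over the network.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    idx = order[:k]
    return idx, [grad[i] for i in idx]

def quantize_int8(values):
    # Uniform symmetric quantization: map floats to integers in [-127, 127],
    # so each surviving value costs 1 byte instead of 4.
    scale = max(abs(v) for v in values) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero gradient: any scale works
    return [round(v / scale) for v in values], scale

def dequantize(q, scale):
    return [x * scale for x in q]

# Hypothetical example: a 1000-element gradient (4000 bytes as float32)
# shrinks to 10 indices, 10 int8 values, and one float scale.
random.seed(0)
grad = [random.gauss(0.0, 1.0) for _ in range(1000)]
idx, vals = topk_sparsify(grad, k=10)
q, scale = quantize_int8(vals)

# The receiver reconstructs a sparse approximation of the gradient.
recovered = [0.0] * len(grad)
for i, v in zip(idx, dequantize(q, scale)):
    recovered[i] = v
```

In real systems the dropped (pruned) gradient entries are typically accumulated locally as error feedback and added back in later steps, so the compression error does not bias training.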
Syllabus
Lecture 14 - Distributed Training and Gradient Compression (Part II) | MIT 6.S965
Taught by
MIT HAN Lab