Overview
This lecture from MIT's 6.5940 course, part of the EfficientML.ai series, is the first part on distributed training and is led by Prof. Song Han. It covers the fundamental concepts, challenges, and strategies for scaling machine learning models across multiple devices or nodes, including parallel processing, data parallelism, and model parallelism as techniques for accelerating the training of large-scale neural networks. It also explains how distributing the workload can shorten training time and make more complex models practical to develop. Accompanying slides are available at efficientml.ai.
Syllabus
EfficientML.ai Lecture 17: Distributed Training (Part I) (MIT 6.5940, Fall 2023)
Taught by
MIT HAN Lab