Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore cutting-edge techniques for accelerating distributed deep neural network (DNN) training in this 50-minute conference talk by Manya Ghobadi at SPCL_Bcast. Delve into the challenges posed by increasing dataset and model sizes, and discover innovative solutions to overcome network bottlenecks in datacenter environments. Learn about a novel optical fabric that optimizes network topology and parallelization strategies for DNN clusters. Examine the limitations of fair-sharing in congestion control algorithms and understand a new scheduling approach that strategically places jobs on network links to enhance performance. Gain insights into the future of machine learning infrastructure and network design for improved training efficiency.