Next-Generation Networks for Machine Learning

Overview

Explore cutting-edge techniques for accelerating distributed deep neural network (DNN) training in this 50-minute conference talk by Manya Ghobadi at SPCL_Bcast. Delve into the challenges posed by increasing dataset and model sizes, and discover innovative solutions to overcome network bottlenecks in datacenter environments. Learn about a novel optical fabric that optimizes network topology and parallelization strategies for DNN clusters. Examine the limitations of fair-sharing in congestion control algorithms and understand a new scheduling approach that strategically places jobs on network links to enhance performance. Gain insights into the future of machine learning infrastructure and network design for improved training efficiency.

Syllabus

Introduction
Talk
Announcements

Taught by

Scalable Parallel Computing Lab, SPCL @ ETH Zurich
