YouTube videos curated by Class Central.
Classroom Contents
KungFu - Making Training in Distributed Machine Learning Adaptive
- 1 Intro
- 2 Training in Distributed ML Systems
- 3 Parameters in Distributed ML Systems
- 4 Issues with Empirical Parameter Tuning
- 5 Proposals for Automatic Parameter Adaptation
- 6 Open Challenges
- 7 Existing Approaches for Adaptation
- 8 KungFu Overview
- 9 Adaptation Policies
- 10 Example: Adaptation Policy for GNS
- 11 Embedding Monitoring Inside Dataflow. Problem: high monitoring cost reduces adaptation benefit. Idea: improve efficiency by adding monitoring operators to the dataflow graph
- 12 Challenges of Dataflow Collective Communication
- 13 Making Collective Communication Asynchronous. Idea: use asynchronous collective communication
- 14 Issues When Adapting System Parameters
- 15 Distributed Mechanism for Parameter Adaptation
- 16 How Effectively Does KungFu Adapt?
- 17 Conclusions: KungFu