Distributed Training and Gradient Compression - Lecture 14

MIT HAN Lab via YouTube


Classroom Contents

  1. Intro
  2. Problems of Distributed Training
  3. Reduce Transfer Data Size: Recall the Workflow of Parameter-Server Based Distributed Training
  4. Limitations of Sparse Communication
  5. Optimizers with Momentum: Repeat, Update Weights
  6. Deep Gradient Compression
  7. Comparison of Gradient Pruning Methods
  8. Latency Bottleneck
  9. High Network Latency Slows Federated Learning
  10. Conventional Algorithms Suffer from High Latency: Vanilla Distributed Synchronous SGD
  11. Delayed Gradient Averaging
  12. DGA Accuracy Evaluation
  13. Real-world Benchmark
  14. Summary of Today's Lecture
  15. References
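
The chapters on sparse communication and Deep Gradient Compression above share one core idea: each worker transmits only the largest-magnitude gradient entries and keeps the rest as a local residual for later steps. The sketch below illustrates that idea with a simple top-k sparsifier; it is a minimal illustration rather than the lecture's reference code, and the function name `sparsify_gradient` and the 1% ratio are assumptions chosen for demonstration.

```python
# Minimal sketch of top-k gradient sparsification, the idea behind sparse
# communication and Deep Gradient Compression. Illustrative only; the
# function name and the 1% ratio are assumptions, not the lecture's code.
import numpy as np

def sparsify_gradient(grad: np.ndarray, ratio: float = 0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Returns the indices and values a worker would transmit, plus the
    residual it would accumulate locally and add to the next gradient.
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # Indices of the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    values = flat[idx]
    # Residual: entries not sent stay on the worker (local accumulation).
    residual = flat.copy()
    residual[idx] = 0.0
    return idx, values, residual.reshape(grad.shape)

# Example: a 1M-element gradient is reduced to ~10k (index, value) pairs.
grad = np.random.randn(1_000_000).astype(np.float32)
idx, values, residual = sparsify_gradient(grad, ratio=0.01)
print(f"sent {values.size} of {grad.size} entries "
      f"({values.size / grad.size:.1%} of the data)")
```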
