Overview
Explore the democratization of machine learning on Kubernetes in this 38-minute Docker conference talk. Learn about data and model parallelization, distributed TensorFlow training, and TensorFlow tools. See demos covering the training environment, model performance, and distributed training results. Examine the compute-to-communication ratio and other observations from the experiments. Investigate potential improvements, including Uber's cluster performance, FreeFlow on CNI, GPU resource scheduling, and Fast AI. Gain insight into why making machine learning more accessible and efficient on Kubernetes matters.
Syllabus
Introduction
Who are we
Why is this important
Data Parallelization
Model Parallelization
Distributed TensorFlow Training
TensorFlow Tools
Demos
Training Environment
Model Performance
Distributed Training
Distributed Training Results
Compute to Communication Ratio
Other Observations
How Can We Improve
Uber's Cluster Performance
FreeFlow on CNI
GPU Resource Scheduler
Fast AI
Taught by
Docker