Overview
Syllabus
Intro
Deep Learning @ Uber
Self-Driving Vehicles
Trip Forecasting
Fraud Detection
Why Distributed Deep Learning?
How Distributed Deep Learning Works
Why Mesos?
Mesos Support for GPUs
Mesos Nested Containers
What is Missing?
Peloton Overview
Peloton Architecture
Elastic GPU Resource Management
Resource Pools
Gang Scheduling
Placement Strategies
Why TensorFlow?
Architecture for Distributed TensorFlow on Mesos
Can We Do Better?
Architecture for Horovod on Mesos
Distributed Training Performance with Horovod
What About Usability?
Giving Back
Thank you!
Taught by
Linux Foundation