Overview
Syllabus
Intro
Early Adoption of Horovod
Deep Learning Refresher
Distributed Deep Learning
Early Distributed Training - Parameter Servers
Parameter Servers - Tradeoffs
Horovod Technique: Allreduce
Benchmarking
Deep Learning in Research
Deep Learning in Production
Feature Store
Model Training
Preprocessing
Spark ML Pipelines
Petastorm: Data Access for Deep Learning Training
Challenges of Training on Large Datasets
Spark 3.0: Resource Aware Scheduling
What if my Spark cluster doesn't have GPUs?
Horovod Lambda - Run data processing on CPUs with Spark
Online Prediction
Neuropod: Out-of-Process Execution
Workflow Authoring: Can we ideate, define, evaluate, and deploy a Deep Learning model all within a single script?
Feature Engineering
Model Construction
Model Deployment
Elastic Horovod: Control Flow
Taught by
Linux Foundation