Overview
Dive into a comprehensive 44-minute talk exploring Ray AIR's data processing engine for scaling training and batch inference. Learn how Ray AIR leverages Ray Datasets to achieve high performance and scalability in ML pipelines. Discover techniques for efficient data loading and preprocessing across multiple machines that address ingest bottlenecks and maximize GPU utilization. Explore key features including distributed data sharding, parallel I/O and transformations, CPU-GPU compute pipelining, autoscaling inference workers, and efficient per-epoch shuffling. Gain insights from real-world case studies of production AIR workloads that showcase their performance and scalability benefits. Master the creation of scalable training and batch inference pipelines with Ray AIR to optimize your machine learning workflows.
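To make the ingest-and-inference pattern described above concrete, here is a minimal sketch (not taken from the talk itself) of a Ray Datasets batch-inference pipeline, assuming Ray 2.x. The bucket paths, the `preprocess` function, and the `Predictor` class are hypothetical placeholders; the `ActorPoolStrategy` call illustrates the autoscaling actor pool for inference covered in the syllabus.

```python
# A minimal sketch of scalable batch inference with Ray Datasets,
# assuming Ray 2.x. Paths, preprocess(), and Predictor are hypothetical.
import ray

# Parallel I/O: blocks of the dataset are read across the cluster.
ds = ray.data.read_parquet("s3://my-bucket/features/")

def preprocess(batch):
    # Hypothetical CPU-side transformation, applied to each block in parallel.
    batch["feature"] = batch["feature"] / 255.0
    return batch

ds = ds.map_batches(preprocess)

class Predictor:
    def __init__(self):
        # Load a model once per inference worker; identity stand-in here.
        self.model = lambda x: x

    def __call__(self, batch):
        batch["pred"] = self.model(batch["feature"])
        return batch

# Autoscaling actor pool (2 to 8 GPU workers): CPU preprocessing is
# pipelined with GPU inference, keeping the GPUs fed.
preds = ds.map_batches(
    Predictor,
    compute=ray.data.ActorPoolStrategy(min_size=2, max_size=8),
    num_gpus=1,
)
preds.write_parquet("s3://my-bucket/predictions/")
```

Run against a real cluster, blocks stream from storage through CPU preprocessing into the GPU actor pool, which is the pipelining behavior the talk attributes to Ray Datasets.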
Syllabus
Intro
Overview
ML Pipelines Must Scale with Data
Distributed Data-Parallel to the Rescue
Scaling the Typical ML Pipeline
Possible Solution - Coordinated Pipelining
Ray Datasets: AIR's Data Processing Engine
Avoiding GPU Data Prep Stalls
Dataset Sharding
Parallel I/O and Transformations
Dataplane Optimizations
Pipelining Ingest with Training
Pipelining Ingest with Inference
Autoscaling Actor Pool for Inference
Per-epoch Shuffling - Distributed
ML Engineer at Telematics Startup
Summary
Taught by
Anyscale