Overview
Explore how Lyft uses Ray on Kubernetes for distributed training in this conference talk. The talk covers the ML platform's infrastructure, which is built entirely on Kubernetes, with an emphasis on its scalability and rapid resource bootstrapping. Learn about the custom SDKs that let users spawn on-demand Ray clusters for model training directly from notebooks. These SDKs hide the complexity of cluster management, so users can focus on their core tasks while the platform handles the technical details.
Syllabus
Distributed training with Ray on Kubernetes at Lyft
Taught by
Anyscale