Overview
Explore the challenges and solutions of running Apache Spark on Kubernetes at scale in this conference talk. Discover how to leverage open-source technologies such as Apache YuniKorn and the Spark K8s operator, together with cloud primitives, to evolve ML data infrastructure in the cloud. Learn about considerations for multi-tenancy, observability, scalability, and cost-effectiveness when implementing Spark on Kubernetes in large-scale production environments. Gain insights on overcoming roadblocks such as DevOps complexity, multi-cluster management, job scheduling, and autoscaling to maximize data processing capabilities in the cloud.
Syllabus
Beyond Experimental: Spark on Kubernetes - Weiwei Yang, Apple
Taught by
CNCF [Cloud Native Computing Foundation]