Kubernetes Batch Processing at Scale - A Scheduling Perspective
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the challenges and solutions of large-scale batch processing on Kubernetes in this conference talk by Lim Haw Jia and Fan Deliang from ByteDance. Discover why Kubernetes was adopted for batch processing and how it handles hundreds of thousands of daily jobs. Learn about the development of a custom Kubernetes scheduler designed to support massive batch processing workloads across clusters with up to 20,000 nodes. Dive into concepts like Gang Scheduling and Dominant Resource Fairness (DRF) and their application in Kubernetes. Understand how parallelizing computationally intensive parts of the scheduling framework improved scalability. Gain insights into colocating batch processing workloads with microservices for better resource utilization and cost savings.
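For readers new to Dominant Resource Fairness, the idea can be illustrated with a small sketch. This is not the ByteDance scheduler or any Kubernetes API. It is a simplified, hypothetical version of the textbook DRF loop. The function names `dominant_share` and `drf_allocate` and the example users and capacities are assumptions chosen for illustration. Each user's dominant share is the largest fraction of any single resource that user holds, and the next task always goes to the user with the smallest dominant share.

```python
from typing import Dict


def dominant_share(allocated: Dict[str, float], capacity: Dict[str, float]) -> float:
    """Largest fraction of any one resource that a user currently holds."""
    return max(allocated[r] / capacity[r] for r in capacity)


def drf_allocate(
    capacity: Dict[str, float],
    demands: Dict[str, Dict[str, float]],
) -> Dict[str, int]:
    """Return the number of tasks granted to each user under a simple DRF loop.

    capacity: total amount of each resource in the cluster (all values > 0).
    demands:  per-task resource demand for each user (hypothetical input shape).
    """
    used = {r: 0.0 for r in capacity}
    allocated = {u: {r: 0.0 for r in capacity} for u in demands}
    tasks = {u: 0 for u in demands}

    # Users whose tasks request nothing are skipped to keep the loop finite.
    active = {u for u, d in demands.items() if any(d.get(r, 0.0) > 0 for r in capacity)}

    while active:
        # Pick the user with the smallest dominant share; ties broken by name.
        user = min(active, key=lambda u: (dominant_share(allocated[u], capacity), u))
        demand = demands[user]

        fits = all(used[r] + demand.get(r, 0.0) <= capacity[r] for r in capacity)
        if fits:
            for r in capacity:
                used[r] += demand.get(r, 0.0)
                allocated[user][r] += demand.get(r, 0.0)
            tasks[user] += 1
        else:
            # Simplification: a user whose next task no longer fits is dropped.
            active.remove(user)

    return tasks


if __name__ == "__main__":
    cluster = {"cpu": 9.0, "memory_gb": 18.0}
    per_task = {
        "A": {"cpu": 1.0, "memory_gb": 4.0},  # memory-heavy tasks
        "B": {"cpu": 3.0, "memory_gb": 1.0},  # CPU-heavy tasks
    }
    print(drf_allocate(cluster, per_task))
```

With these example numbers, user A ends up with three tasks and user B with two, so both hold a dominant share of two thirds (A in memory, B in CPU). A production scheduler would also have to handle per-node placement, queues, and gang constraints, which this sketch leaves out.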
Syllabus
Kubernetes Batch Processing at Scale - A Scheduling Perspective - Lim Haw Jia & Fan Deliang
Taught by
CNCF [Cloud Native Computing Foundation]