Building a Batch System for the Cloud with Kueue
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the key concepts of Kueue, a cloud-native job queueing system, in this 29-minute conference talk. Learn how to address resource constraints in batch, HPC, and AI/ML clusters that serve multiple teams and researchers. Discover how Kueue works alongside the default Kubernetes scheduler, the job controller, and the cluster-autoscaler to form a complete batch system. Understand how Kueue implements job queueing: it decides when a job should wait and when it should start, based on quotas and a hierarchy that shares resources fairly among teams. See why Kueue suits cloud environments, where resources are heterogeneous and fungible and can be scaled up or down to optimize cost. Learn how to model your teams and resources so that Kueue can turn your Kubernetes cluster into an efficient batch system.
Syllabus
Building a Batch System for the Cloud with Kueue - Aldo Culquicondor, Google & Kante Yin, DaoCloud
Taught by
CNCF [Cloud Native Computing Foundation]