Precision Matters: Scheduling GPU Workloads on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a comprehensive conference talk on optimizing GPU workload scheduling in Kubernetes environments. Dive into Uber's approach to supporting AI/ML workloads, including LLM training, with GPU acceleration. Learn about implementing the NVIDIA device plugin for GPU resource management, using cAdvisor to collect GPU metrics, and employing scheduler plugins to distribute workloads efficiently across heterogeneous clusters. Discover strategies for handling different GPU SKUs, implementing precise scheduling algorithms, and balancing load-aware scheduling with bin-packing techniques. Gain insights into future developments, including fractional GPU support, topology-aware scheduling, and expanded support for other GPU vendors such as AMD and Intel.
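To make the device-plugin mechanism mentioned above concrete, here is a minimal Go sketch using client-go that creates a Pod requesting whole GPUs via the extended resource nvidia.com/gpu and targeting a specific GPU SKU with a node selector. This is an illustrative assumption of how such a request might look, not Uber's actual setup; the gpu.sku label, namespace, and container image are hypothetical placeholders.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load kubeconfig from the default location (assumes running outside the cluster).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "llm-training-worker"},
		Spec: corev1.PodSpec{
			// Target a specific GPU SKU via a node label; "gpu.sku" is a
			// hypothetical label name used here for illustration.
			NodeSelector: map[string]string{"gpu.sku": "a100"},
			Containers: []corev1.Container{{
				Name:  "trainer",
				Image: "example.com/llm-trainer:latest", // placeholder image
				Resources: corev1.ResourceRequirements{
					Limits: corev1.ResourceList{
						// The NVIDIA device plugin advertises GPUs as the
						// extended resource "nvidia.com/gpu"; without fractional
						// GPU support, only whole GPUs can be requested.
						"nvidia.com/gpu": resource.MustParse("2"),
					},
				},
			}},
			RestartPolicy: corev1.RestartPolicyNever,
		},
	}

	created, err := clientset.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created pod:", created.Name)
}
```

On the scheduling side, a bias toward bin packing can be expressed through kube-scheduler's scoring configuration (for example, the NodeResourcesFit plugin with a MostAllocated scoring strategy), though the specific scheduler plugins and trade-offs Uber uses are discussed in the talk itself.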
Syllabus
Precision Matters: Scheduling GPU Workloads on Kubernetes - Amit Kumar & Gaurav Kumar, Uber
Taught by
CNCF [Cloud Native Computing Foundation]