Precision Matters: Scheduling GPU Workloads on Kubernetes

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore a conference talk on scheduling GPU workloads in Kubernetes. Learn how Uber supports GPU-accelerated AI/ML workloads, including LLM training, by using NVIDIA device plugins for GPU resource management, cAdvisor for GPU metrics, and scheduler plugins to distribute workloads across heterogeneous clusters. Discover strategies for handling different GPU SKUs, implementing precise scheduling algorithms, and balancing load-aware scheduling against bin packing. Gain insight into planned work, including fractional GPU support, topology-aware scheduling, and support for other GPU vendors such as AMD and Intel.
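To make the trade-off between bin packing and load-aware scheduling concrete, here is a minimal Python sketch of a node-scoring function. It is an illustration only, not the algorithm presented in the talk and not a Kubernetes API: the `Node` fields, the `score_node` and `pick_node` helpers, and the `pack_weight` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical view of a GPU node; field names are assumptions."""
    name: str
    gpu_sku: str            # e.g. "A100" or "H100"
    gpu_total: int          # GPUs installed on the node
    gpu_allocated: int      # GPUs already requested by scheduled pods
    gpu_utilization: float  # measured utilization in the range 0.0 to 1.0


def score_node(node: Node, requested_gpus: int, sku: str,
               pack_weight: float = 0.5) -> Optional[float]:
    """Return a score in [0, 1], or None if the node cannot host the pod."""
    if requested_gpus <= 0:
        raise ValueError("requested_gpus must be positive")
    # Filter step: the SKU must match and enough whole GPUs must be free.
    if node.gpu_sku != sku:
        return None
    if requested_gpus > node.gpu_total - node.gpu_allocated:
        return None
    # Bin packing: prefer nodes that end up fuller after placement.
    packing = (node.gpu_allocated + requested_gpus) / node.gpu_total
    # Load awareness: prefer nodes whose GPUs are measurably less busy.
    headroom = 1.0 - node.gpu_utilization
    return pack_weight * packing + (1.0 - pack_weight) * headroom


def pick_node(nodes: List[Node], requested_gpus: int, sku: str,
              pack_weight: float = 0.5) -> Optional[Node]:
    """Return the highest-scoring feasible node, or None if none fits."""
    best, best_score = None, -1.0
    for node in nodes:
        score = score_node(node, requested_gpus, sku, pack_weight)
        if score is not None and score > best_score:
            best, best_score = node, score
    return best


nodes = [
    Node("node-a", "A100", gpu_total=8, gpu_allocated=6, gpu_utilization=0.9),
    Node("node-b", "A100", gpu_total=8, gpu_allocated=2, gpu_utilization=0.1),
    Node("node-c", "H100", gpu_total=8, gpu_allocated=0, gpu_utilization=0.0),
]
packed = pick_node(nodes, requested_gpus=2, sku="A100", pack_weight=1.0)
spread = pick_node(nodes, requested_gpus=2, sku="A100", pack_weight=0.0)
```

With `pack_weight=1.0` the sketch behaves as pure bin packing and fills the busier node, while `pack_weight=0.0` considers only measured load and picks the idle one. Values in between blend the two goals.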

Syllabus

Precision Matters: Scheduling GPU Workloads on Kubernetes - Amit Kumar & Gaurav Kumar, Uber

Taught by

CNCF [Cloud Native Computing Foundation]
