Unlocking the Full Potential of GPUs for AI Workloads on Kubernetes

Overview

Explore Dynamic Resource Allocation (DRA), a Kubernetes feature for improving GPU utilization in AI workloads. See how DRA changes resource scheduling by putting it in the hands of third-party developers and moving beyond the limits of traditional "countable" interfaces. Review the GPU management capabilities this enables: controlled GPU sharing within and across pods, support for multiple GPU models per node, specification of arbitrary GPU constraints, and dynamic allocation of Multi-Instance GPUs (MIG). Learn the key features of NVIDIA's DRA resource driver for GPUs, then follow practical demonstrations showing how to use it in your own Kubernetes environment for more efficient and flexible GPU resource management.

Syllabus

Unlocking the Full Potential of GPUs for AI Workloads on Kubernetes - Kevin Klues, NVIDIA

Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Building a Driver for Dynamic Resource Allocation in Kubernetes - Device Plugins 2.0

Accelerating AI Workloads with GPUs in Kubernetes

A Deep Dive on Supporting Multi-Instance GPUs in Containers and Kubernetes

Which GPU Sharing Strategy Is Right for You? A Comprehensive Benchmark Study Using Dynamic Resource Allocation

Scaling AI Workloads with Kubernetes - Sharing GPU Resources Across Multiple Containers

What Can I Get You? An Introduction to Dynamic Resource Allocation
