Overview
Explore strategies for improving GPU utilization in Kubernetes in this 38-minute conference talk by Maulin Patel and Pradeep Venkatachalam from Google. Learn about the limitation of Kubernetes' current GPU allocation model, which accepts GPU requests only in whole-number units, so a container that needs a fraction of a GPU still reserves an entire device, leading to overprovisioning and waste. Discover the user-friendly solutions developed to let multiple containers share a single GPU, improving resource efficiency and reducing costs. See how these solutions perform through demonstrations and real-world examples, with a focus on inference workloads that process real-time data samples.
Syllabus
Improving GPU Utilization using Kubernetes - Maulin Patel & Pradeep Venkatachalam, Google
Taught by
CNCF [Cloud Native Computing Foundation]