
Scaling Kubernetes Clusters for Generative Models - Managing GPU Resources for AI Applications

Linux Foundation via YouTube

Overview

Explore techniques for efficiently scaling Generative AI workloads on Kubernetes in this 32-minute talk by Jack Min Ong from Jina AI. Delve into the challenges of GPU resource management and learn how to leverage Kubernetes together with the NVIDIA GPU Operator to configure and consume GPU resources at scale. Discover various methods for sharing GPU devices and optimizing GPU usage in generative model pipelines. Gain a practical understanding of provisioning and sharing GPU resources across multiple containers, enabling you to maximize GPU investments and accelerate Generative AI applications.
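As context for the talk's subject matter, the sketch below shows the basic pattern it builds on: once the NVIDIA GPU Operator (or the NVIDIA device plugin it installs) is running in a cluster, GPUs are exposed as the extended resource `nvidia.com/gpu`, and a container requests one through its resource limits. This is a minimal illustration using the official Kubernetes Python client, not code from the talk; the pod name, namespace, and container image are placeholders.

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (use load_incluster_config() inside a pod).
config.load_kube_config()

# Pod with a single container that requests one GPU via the extended resource
# advertised by the NVIDIA device plugin / GPU Operator.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="generator",
                image="nvcr.io/nvidia/pytorch:24.01-py3",  # placeholder image
                command=["python", "-c", "import torch; print(torch.cuda.is_available())"],
                resources=client.V1ResourceRequirements(
                    # The scheduler only places this pod on a node with a free GPU.
                    limits={"nvidia.com/gpu": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

GPU-sharing approaches covered under the "sharing GPU devices" theme (such as time-slicing or MIG partitioning) keep this same request pattern; they change how many schedulable `nvidia.com/gpu` units a physical device presents to the cluster.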

Syllabus

Scaling Kubernetes Clusters for Generative Models: Managing GPU Resources for AI Applications - Jack Min Ong

Taught by

Linux Foundation

