Better Sharing of AI Accelerators in OpenStack with Blazar Reservations
OpenInfra Foundation via YouTube
Overview
Learn how to share GPU resources more effectively in OpenStack environments in this 47-minute conference talk from OpenInfra Day France 2024. The talk examines the challenges of managing GPU devices for AI and machine learning workloads in private clouds, where limited availability and high costs lead to resource contention. It shows how the OpenStack Blazar project enables advanced resource reservation capabilities, with a focus on StackHPC's developments for AI-optimized flavors, GPU accelerator allocation, and CPU pinning enhancements. Presenter Pierre Riteau of StackHPC covers practical ways to improve resource utilization and to share AI infrastructure components more fairly in OpenStack deployments.
Syllabus
Get off my GPU! Better sharing of AI accelerators in OpenStack with Blazar reservations (English)
Taught by
OpenInfra Foundation