Resource Orchestration of HPC on Kubernetes - Where We Are Now and the Journey Ahead
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the challenges and progress in enabling High-Performance Computing (HPC) on Kubernetes in this conference talk. Dive into the complexities of resource orchestration for HPC workloads, including NUMA-aware scheduling, advanced resource allocation, and job dependency management. Learn how sig-node contributors are addressing the information disconnect between the kubelet and the scheduler by implementing a NUMA-aware scheduler and its supporting infrastructure. Gain insights into the feature's development journey, the challenges encountered along the way, and the resulting end-to-end solution. Discover the current adoption status, the roadmap, and the deployment steps for optimizing workload performance. Understand how this work aims to bridge the gap between Kubernetes' widespread use in cloud and enterprise environments and its potential in the HPC domain.
Syllabus
Resource Orchestration of HPC on Kubernetes: Where We Are Now and... Swati Sehgal & Francesco Romani
Taught by
CNCF [Cloud Native Computing Foundation]