Bloomberg's Journey to Improve Resource Utilization in a Multi-Cluster Platform
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how Bloomberg optimizes resource utilization in its on-premises Data Science Platform (DSP) in this technical conference talk. Explore the challenges of managing GPU workloads across multiple Kubernetes clusters in different data centers, and the solutions that came out of the collaboration between Bloomberg's DSP team and the Karmada community. Topics include intelligent GPU workload scheduling, implementation of Kubernetes Custom Resource Definitions in multi-cluster environments, development of a highly available Karmada control plane, and a streamlined interface for submitting training jobs. Gain insight into managing large-scale, heterogeneous GPU environments while improving resource utilization and reducing operational costs.
Syllabus
Bloomberg’s Journey to Improve Resource Utilization in a Multi-Cluster Platf... Yao Weng & Leon Zhou
Taught by
CNCF [Cloud Native Computing Foundation]