Co-Location of CPU and GPU Workloads for High Resource Efficiency in Kubernetes

Overview

This 37-minute conference talk from the Linux Foundation covers strategies for raising resource utilization in Kubernetes clusters by co-locating CPU and GPU workloads. It describes how Ant Financial and Alibaba achieved a 10% increase in utilization using three techniques: a new QoS class, node-level cgroups for batch jobs, and a PodGroup CRD for gang scheduling. The speakers share their experience and practices from building and managing a co-location cluster of over 100 GPU nodes and 500 CPU nodes that runs long-running services alongside AI batch jobs.
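
To make the gang scheduling idea concrete, here is a minimal sketch of all-or-nothing placement: either every pod in a group gets a slot, or none is started. It is an illustration only, not the scheduler described in the talk. The `PodGroup` class, its `min_member` and `gpus_per_pod` fields, and the `gang_schedule` function are hypothetical names chosen for this example and are not the fields of any real CRD.

```python
from dataclasses import dataclass


@dataclass
class PodGroup:
    """Hypothetical stand-in for a pod group: pods that must start together."""
    name: str
    min_member: int    # number of pods that must be placeable at the same time
    gpus_per_pod: int  # GPUs requested by each pod (assumed uniform, > 0)


def gang_schedule(group, free_gpus):
    """Return a {node: pod_count} placement, or None if the whole gang does not fit.

    free_gpus maps a node name to its number of currently free GPUs.
    """
    placement = {}
    remaining = group.min_member
    for node, free in free_gpus.items():
        if remaining == 0:
            break
        # How many of the still-unplaced pods fit on this node.
        fit = min(free // group.gpus_per_pod, remaining)
        if fit > 0:
            placement[node] = fit
            remaining -= fit
    # All-or-nothing: a partial placement is discarded rather than committed.
    return placement if remaining == 0 else None


if __name__ == "__main__":
    job = PodGroup(name="train", min_member=4, gpus_per_pod=2)
    print(gang_schedule(job, {"node-a": 4, "node-b": 2, "node-c": 4}))
    print(gang_schedule(job, {"node-a": 4, "node-b": 2}))
```

The second call returns `None` because only three of the four pods fit. Refusing partial placements keeps a distributed training job from holding GPUs while it waits for workers that cannot be scheduled.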

Syllabus

Co-Location of CPU and GPU Workloads with High Resource Efficiency - Penghao Cen & Jian He

Taught by

Linux Foundation

Related Courses

Coordinate Workloads Colocation: QoS-Oriented Scheduling Enhancement on Kubernetes

Improving GPU Utilization and Accelerating Model Training with Kubernetes Scheduling Framework and NRI

A Hybrid Container Cloud With Kubernetes and Hadoop YARN

Precision Matters: Scheduling GPU Workloads on Kubernetes

Maximizing GPU Utilization Over Multi-Cluster - Challenges and Solutions for Cloud-Native AI Platform

Minimizing GPU Cost for Deep Learning on Kubernetes
