Optimizing Knowledge Distillation Training With Volcano
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a conference talk on optimizing knowledge distillation training with Volcano. The talk describes using Volcano as the scheduler to deploy teacher models on GPU inference cards in online Kubernetes clusters, which raises the throughput of knowledge distillation. Learn how this approach allows flexible scheduling, mitigating task failures during peak hours and maximizing the use of cluster resources. The talk walks through the process of optimizing elastic distillation training with Volcano and presents benchmark data. It also covers large-scale training, Elastic Deep Learning, and the advantages of the approach, along with the Volcano architecture, GPU sharing, and Volcano's integration with Kubernetes for efficient model compression and deployment.
Syllabus
Introduction
Project Background
Large Scale Training
Elastic Deep Learning
Knowledge Distillation
Advantages
Training Vector
William Wang
Challenges
CNCF Sandbox
Volcano Architecture
Survival Kubernetes
Volcano Job
GPU Sharing
Cromwell
Commander
Kubernetes
Taught by
CNCF [Cloud Native Computing Foundation]