Power-aware Deep Learning Model Serving with μ-Serve

USENIX via YouTube

Overview

Explore power-aware deep learning model serving with μ-Serve in this 21-minute conference talk from USENIX ATC '24. Discover how researchers from the University of Illinois Urbana-Champaign and IBM Research address the challenge of reducing energy consumption in model-serving clusters while maintaining performance requirements. Learn about the benefits of GPU frequency scaling for power saving in model serving and the importance of co-designing fine-grained model multiplexing with GPU frequency scaling. Examine μ-Serve, a novel power-aware model-serving system that optimizes power consumption and performance for serving multiple ML models in a homogeneous GPU cluster. Gain insights into evaluation results showing significant power savings through dynamic GPU frequency scaling without compromising service level objectives.

Syllabus

USENIX ATC '24 - Power-aware Deep Learning Model Serving with μ-Serve

Taught by

USENIX
