

AlpaServe - Statistical Multiplexing with Model Parallelism for Deep Learning Serving

USENIX via YouTube

Overview

This 15-minute conference talk from OSDI '23 presents AlpaServe, a serving system for deep learning models. AlpaServe uses model parallelism to statistically multiplex multiple devices across models, even when each individual model fits on a single device. The talk examines the trade-off between the overhead that model parallelism introduces and the reduction in serving latency that statistical multiplexing provides under bursty workloads. It then describes AlpaServe's strategy for placing and parallelizing collections of large deep learning models across a distributed cluster. Evaluation on production workloads shows that AlpaServe can process requests at significantly higher rates and tolerate more burstiness while staying within latency constraints for over 99% of requests.
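To make the trade-off concrete, here is a minimal toy simulation in Python. It is not AlpaServe's code or algorithm. It assumes two equally sized models, two GPUs, a single-GPU inference time of 1.0, and a hypothetical 10% overhead when a model is split into two pipeline stages; the function names and all numbers are illustrative assumptions. It compares dedicated placement (one model per GPU) with a placement where both models are pipelined across both GPUs, under a burst of four requests for one model.

```python
from statistics import mean


def dedicated_latencies(arrivals, service_time=1.0):
    """One model per GPU: a model's requests queue FIFO on its single GPU."""
    free_at = 0.0
    latencies = []
    for t in sorted(arrivals):
        start = max(t, free_at)          # wait until the GPU is idle
        free_at = start + service_time
        latencies.append(free_at - t)
    return latencies


def pipelined_latencies(arrivals, service_time=1.0, num_stages=2, overhead=0.1):
    """Models split into pipeline stages, one stage per GPU.

    `overhead` is an assumed fractional slowdown from model parallelism.
    All requests, whichever model they target, share the same pipeline.
    """
    stage_time = service_time * (1 + overhead) / num_stages
    stage_free_at = [0.0] * num_stages
    latencies = []
    for t in sorted(arrivals):
        ready = t
        for s in range(num_stages):
            start = max(ready, stage_free_at[s])  # wait for this stage's GPU
            ready = start + stage_time
            stage_free_at[s] = ready
        latencies.append(ready - t)
    return latencies


# Bursty workload: four requests for model A arrive at once, none for model B.
burst = [0.0, 0.0, 0.0, 0.0]
print("dedicated mean latency:", round(mean(dedicated_latencies(burst)), 3))
print("pipelined mean latency:", round(mean(pipelined_latencies(burst)), 3))

# A single isolated request shows the cost side of the trade-off.
print("single request, dedicated:", dedicated_latencies([0.0])[0])
print("single request, pipelined:", round(pipelined_latencies([0.0])[0], 3))
```

In this toy setup the burst finishes sooner with the pipelined placement, because the GPU that would otherwise sit idle for model B absorbs part of the burst, while an isolated request is slightly slower because of the assumed parallelism overhead.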

Syllabus

OSDI '23 - AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Taught by

USENIX
