Finetuning, Serving, and Evaluating Large Language Models in the Wild
Classroom Contents
- 1 - Welcome to the world of large language models with Dr. Hao Zhang, postdoctoral researcher at the Sky Lab, UC Berkeley. In this talk, Finetuning, Serving, and Evaluating LLMs in the Wild, …
- 2 - Introductions
- 3 - Background
- 4 - An Example
- 5 - Chatbot Arena: Deployment & Elo-based Leaderboard
- 6 - Today’s Focus: Behind the Scene
- 7 - Key Insight
- 8 - vLLM: Efficient Memory Management for LLM Inference
- 9 - Memory Efficiency of vLLM
- 10 - vLLM Open-Source Adoption
- 11 - Key Idea