USENIX ATC '24 - Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric...
Quant-LLM: Accelerating Large Language Model Serving via FP6-Centric Algorithm-System Co-Design