Understanding Medusa: A Framework for LLM Inference Acceleration with Multiple Decoding Heads

Oxen via YouTube

Speculative Decoding Example
