MobileLLM - Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
EDGE AI FOUNDATION via YouTube
Overview
Watch a 20-minute research presentation on MobileLLM, an approach to building efficient language models for deployment on mobile devices. Learn how deep-and-thin architectures, embedding sharing, and grouped-query attention enable strong language models with fewer than a billion parameters. See how these optimizations improve accuracy on commonsense reasoning tasks over previous state-of-the-art models of the same size, by 2.7% for the 125M-parameter model and 4.3% for the 350M-parameter model. Understand how these results challenge the conventional wisdom that data and parameter quantity are the primary drivers of model quality, and how the models perform comparably to much larger models in practical applications such as API calling tasks.
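To make the overview's architectural ideas concrete, the following is a rough parameter-count sketch, not the presenters' code. It assumes a decoder-only transformer with a gated (three-matrix) feed-forward block and ignores norms and biases. The function name `count_params` and the configuration values are hypothetical, chosen only to land near the 125M scale.

```python
def count_params(vocab, dim, n_layers, n_heads, n_kv_heads, ffn_dim, tie_embeddings):
    """Approximate weight count for a decoder-only transformer.

    Assumptions (not taken from the talk): gated FFN with three
    weight matrices, no biases, normalization weights ignored.
    """
    head_dim = dim // n_heads
    # Input embedding table; the output projection reuses it when tied.
    embed = vocab * dim
    output_proj = 0 if tie_embeddings else vocab * dim
    # Attention: full-size Q and O projections, while K and V are
    # shrunk to n_kv_heads heads (grouped-query attention).
    kv_dim = n_kv_heads * head_dim
    attn = dim * dim + 2 * dim * kv_dim + dim * dim
    # Gated feed-forward block: gate, up, and down projections.
    ffn = 3 * dim * ffn_dim
    return embed + output_proj + n_layers * (attn + ffn)


# Hypothetical deep-and-thin configuration (illustrative values only).
base = dict(vocab=32000, dim=576, n_layers=30, n_heads=9, ffn_dim=1536)

# Baseline: separate input/output embeddings, one KV head per query head.
plain = count_params(**base, n_kv_heads=9, tie_embeddings=False)
# Embedding sharing only.
shared = count_params(**base, n_kv_heads=9, tie_embeddings=True)
# Embedding sharing plus grouped-query attention (3 query heads per KV head).
shared_gqa = count_params(**base, n_kv_heads=3, tie_embeddings=True)

for name, n in [("plain", plain), ("shared", shared), ("shared+gqa", shared_gqa)]:
    print(f"{name:>11}: {n / 1e6:.1f}M parameters")
```

In this sketch, embedding sharing removes exactly one `vocab * dim` matrix, which is a large fraction of the total at this scale, and grouped-query attention trims the key and value projections in every layer. The saved budget can then be spent on additional layers, which is the deep-and-thin trade-off the overview describes.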
Syllabus
GenAI on the Edge Forum: MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device..
Taught by
EDGE AI FOUNDATION