RecurrentGemma: Moving Past Transformers with Griffin Architecture for Long Context Length

Overview

Explore a technical video on Griffin, Google's recurrent LLM architecture, which marks a significant shift away from traditional transformer-based models. Learn about the RecurrentGemma-2B model, which achieves a throughput of 6000 tokens per second while performing comparably to the transformer-based Gemma 2B. Discover how the Griffin and Hawk architectures work, with detailed explanations of their advantages over State Space Models such as Mamba-S6. Master concepts including local attention mechanisms, linear recurrences, GRU (Gated Recurrent Unit), LRU (Linear Recurrent Unit), and RG-LRU (Real-Gated Linear Recurrent Unit). Gain insights into the model's fixed-size state, which uses less memory on long sequences than a transformer's key-value cache, whose size grows with sequence length. Examine performance benchmarks and GitHub code examples, and understand how the architecture maintains high throughput regardless of sequence length while requiring 33% fewer training tokens than its transformer counterpart.
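The memory argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the video or from the RecurrentGemma repository: the function names and all dimensions (4 layers, 1 key-value head, head size 8, state size 16, window 2048) are hypothetical values chosen only to show how each quantity scales with sequence length.

```python
def kv_cache_entries(seq_len, num_layers, num_kv_heads, head_dim):
    """Values held in a global-attention KV cache: one key and one value
    vector per token, per head, per layer. Grows linearly with seq_len."""
    return 2 * seq_len * num_layers * num_kv_heads * head_dim


def local_attention_cache_entries(seq_len, window, num_layers, num_kv_heads, head_dim):
    """Same cache, but sliding-window attention only keeps the most recent
    `window` tokens, so the size is capped once seq_len exceeds window."""
    return 2 * min(seq_len, window) * num_layers * num_kv_heads * head_dim


def recurrent_state_entries(num_layers, state_dim):
    """Values held by a recurrent layer stack: one fixed-size state vector
    per layer, independent of how many tokens have been processed."""
    return num_layers * state_dim


if __name__ == "__main__":
    # Hypothetical toy dimensions, not the real model configuration.
    layers, heads, head_dim, state_dim, window = 4, 1, 8, 16, 2048
    for seq_len in (1024, 8192, 65536):
        print(
            seq_len,
            kv_cache_entries(seq_len, layers, heads, head_dim),
            local_attention_cache_entries(seq_len, window, layers, heads, head_dim),
            recurrent_state_entries(layers, state_dim),
        )
```

Under these assumptions the first count keeps growing with the sequence, while the windowed cache and the recurrent state stop growing, which is the property the overview attributes to the fixed-size state.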

Syllabus

Llama 3 inference and finetuning
New Language Model Dev
Local Attention
Linear complexity of RNN
Gated Recurrent Unit - GRU
Linear Recurrent Unit - LRU
Griffin architecture
Real-Gated Linear Recurrent Unit - RG-LRU
Griffin Key Features
RecurrentGemma
GitHub code
Performance benchmark
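As a companion to the recurrence topics in the syllabus, here is a minimal sketch of a gated linear recurrence in the style of the RG-LRU. It is a simplified illustration, not the RecurrentGemma implementation: the function name rg_lru_scan is hypothetical, the gates use one scalar weight per channel where a real layer would use learned matrices, and the constant c = 8 and the update rule follow the form given in the Griffin paper as best understood here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rg_lru_scan(xs, lam, w_a, b_a, w_x, b_x, c=8.0):
    """Run a simplified, per-channel RG-LRU-style recurrence over a sequence.

    xs:  list of input vectors (lists of floats), one per time step.
    lam, w_a, b_a, w_x, b_x: per-channel parameters (lists of floats).
    Returns the list of hidden states, one per time step.
    """
    dim = len(lam)
    h = [0.0] * dim  # the fixed-size state carried across time steps
    states = []
    for x in xs:
        new_h = []
        for k in range(dim):
            r = sigmoid(w_a[k] * x[k] + b_a[k])  # recurrence gate
            i = sigmoid(w_x[k] * x[k] + b_x[k])  # input gate
            a = sigmoid(lam[k]) ** (c * r)       # gated decay, in (0, 1)
            # Linear in h: no nonlinearity is applied to the previous state.
            new_h.append(a * h[k] + math.sqrt(1.0 - a * a) * (i * x[k]))
        h = new_h
        states.append(h)
    return states


if __name__ == "__main__":
    dim = 2
    zeros = [0.0] * dim
    inputs = [[1.0, -1.0], [0.5, 0.25], [0.0, 2.0]]
    for step, state in enumerate(rg_lru_scan(inputs, zeros, zeros, zeros, zeros, zeros)):
        print(step, state)
```

Each step costs the same amount of work and touches only the current input and a state of length dim, so total cost is linear in sequence length and memory does not depend on it.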

Taught by

Discover AI
