
INFINI Attention: Efficient Infinite Context Transformers with 1 Million Token Context Length

Discover AI via YouTube

Overview

Explore a technical video presentation detailing Google's Infini-attention transformer architecture, designed to handle context lengths of up to 1 million tokens. Learn how a compressive memory is integrated into the vanilla attention mechanism, allowing models to store and retrieve historical key-value states efficiently. Understand the technical challenges and solutions around information compression, implementation complexity, and performance optimization. Dive into detailed mathematical explanations of the memory update and retrieval processes, review benchmark data, and explore the relationship between Infini-attention and internal RAG systems. The presentation concludes with insights into TransformerFAM with feedback attention and includes a simplified summary for beginners. Based on the research paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention," this comprehensive breakdown covers everything from basic concepts to advanced mathematical implementations.
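
For readers who want a concrete picture of the memory update and retrieval steps the video walks through, here is a minimal NumPy sketch loosely following the equations in the paper (linear-attention memory with an ELU+1 feature map, a simple additive update rule, and a learned gate that blends the memory readout with local attention). The function names (`retrieve_memory`, `update_memory`, `combine`) and the toy dimensions in the demo are illustrative assumptions, not the video's or the paper's reference implementation.

```python
import numpy as np

def elu_plus_one(x):
    # Feature map sigma(x) = ELU(x) + 1 used for the linear-attention memory
    return np.where(x > 0, x + 1.0, np.exp(x))

def retrieve_memory(Q, M, z):
    # A_mem = sigma(Q) M / (sigma(Q) z): read historical value states from memory
    sQ = elu_plus_one(Q)                      # (seq, d_key)
    return (sQ @ M) / (sQ @ z)[:, None]       # (seq, d_value)

def update_memory(K, V, M, z):
    # Simple linear update: M_s = M_{s-1} + sigma(K)^T V,  z_s = z_{s-1} + sum_t sigma(K_t)
    sK = elu_plus_one(K)
    return M + sK.T @ V, z + sK.sum(axis=0)

def combine(A_mem, A_dot, beta):
    # Learned scalar gate blends the compressive-memory readout with local dot-product attention
    g = 1.0 / (1.0 + np.exp(-beta))
    return g * A_mem + (1.0 - g) * A_dot

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_key, d_value, seg = 4, 4, 8
    M = np.zeros((d_key, d_value))
    z = np.full(d_key, 1e-6)                  # small epsilon so the first retrieval avoids division by zero
    for _ in range(3):                        # process segments sequentially, carrying memory across them
        Q = rng.normal(size=(seg, d_key))
        K = rng.normal(size=(seg, d_key))
        V = rng.normal(size=(seg, d_value))
        A_mem = retrieve_memory(Q, M, z)      # read with the memory state from previous segments
        M, z = update_memory(K, V, M, z)      # then fold the current segment's keys/values into memory
```

The key design point the sketch tries to convey is that memory size stays fixed (a d_key x d_value matrix plus a normalizer) no matter how many segments are processed, which is what lets the context grow toward 1 million tokens at bounded memory cost.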

Syllabus

Infinite context length of LLM
INFINI paper by Google
Matrix Memory of limited size
Update memory simple
Retrieve memory simple
Update memory maths
Retrieve memory maths
Infini attention w/ internal RAG?
Benchmark data
Summary for green grasshoppers
TransformerFAM w/ Feedback attention

Taught by

Discover AI
