
Ring Attention and Blockwise Transformers for Extended Context Length in Language Models

Discover AI via YouTube

Overview

Explore a technical video lecture that delves into Ring Attention, a breakthrough technique enabling context lengths of 1 million tokens in Large Language Models (LLMs) and Vision Language Models (VLMs). Learn about the Blockwise Parallel Transformer concept developed at UC Berkeley, from theoretical foundations to practical implementation. Understand the three approaches to achieving infinite context lengths, the mechanics of query, key, and value (Q, K, V) operations explained through a library analogy, and the mathematical principles behind blockwise parallel transformers. Examine ring attention symmetries, a detailed explanation of the ring attention mechanism, and its implementation in JAX code (simplified sketches of both ideas appear below). Discover how this technology is being applied in real-world products such as Google's Gemini 1.5 Pro on Vertex AI, and get an outlook on future developments with Google's Infini Attention. The breakdown includes practical code examples and mathematical explanations, making these concepts accessible to technical audiences who want to deepen their understanding of attention mechanisms in AI models.
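To make the blockwise idea concrete before watching, here is a minimal sketch in JAX of attention computed one key/value block at a time with running softmax statistics, so the full attention matrix is never materialized. This is not the lecturer's code; the function name blockwise_attention, the block count, and the tensor shapes are illustrative assumptions, and causal masking is omitted for brevity.

```python
import jax
import jax.numpy as jnp

def blockwise_attention(q, k_blocks, v_blocks):
    """q: (Tq, d); k_blocks, v_blocks: lists of (Tb, d) arrays (illustrative shapes)."""
    d = q.shape[-1]
    # Running max, softmax denominator, and weighted-value accumulator (online softmax).
    m = jnp.full((q.shape[0],), -jnp.inf)
    denom = jnp.zeros((q.shape[0],))
    acc = jnp.zeros_like(q)
    for k_blk, v_blk in zip(k_blocks, v_blocks):
        s = q @ k_blk.T / jnp.sqrt(d)            # scores for this block only
        m_new = jnp.maximum(m, s.max(axis=-1))   # updated running max
        p = jnp.exp(s - m_new[:, None])          # stabilized block weights
        scale = jnp.exp(m - m_new)               # rescale previous accumulators
        denom = denom * scale + p.sum(axis=-1)
        acc = acc * scale[:, None] + p @ v_blk
        m = m_new
    return acc / denom[:, None]

# Example usage on random data, checked against full attention.
q = jax.random.normal(jax.random.PRNGKey(0), (8, 16))
k = jax.random.normal(jax.random.PRNGKey(1), (32, 16))
v = jax.random.normal(jax.random.PRNGKey(2), (32, 16))
out = blockwise_attention(q, jnp.split(k, 4), jnp.split(v, 4))
full = jax.nn.softmax(q @ k.T / jnp.sqrt(16.0)) @ v
assert jnp.allclose(out, full, atol=1e-5)
```

Because the per-block rescaling keeps the softmax exact, the blockwise result matches full attention while memory scales with the block size rather than the full sequence length.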

Syllabus

3 ways for infinite context lengths
Blockwise Parallel Transformers
Q, K, V explained in a library
BPT explained in a library
Maths for blockwise parallel transformers
Ring attention symmetries
Ring attention explained
Ring attention JAX code (see the sketch after this syllabus)
Outlook: Infini Attention by Google
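Complementing the "Ring attention explained" and "Ring attention JAX code" chapters, the sketch below simulates the ring schedule on a single host: each index plays the role of one device that keeps its query block fixed while key/value blocks rotate one hop per step, reusing the same running-softmax update as above. In a real multi-device implementation the rotation is a collective permute (e.g. jax.lax.ppermute) overlapped with compute; everything here (ring_attention_simulated, shapes, block counts) is an illustrative assumption, not the video's code.

```python
import jax.numpy as jnp

def ring_attention_simulated(q_blocks, k_blocks, v_blocks):
    """Index i plays one device holding q_blocks[i] and, initially, k_blocks[i]/v_blocks[i].
    KV blocks rotate one hop per step so every device eventually sees every block."""
    n = len(q_blocks)
    d = q_blocks[0].shape[-1]
    # Per-"device" online-softmax state.
    m = [jnp.full((q.shape[0],), -jnp.inf) for q in q_blocks]
    denom = [jnp.zeros((q.shape[0],)) for q in q_blocks]
    acc = [jnp.zeros_like(q) for q in q_blocks]
    k_cur, v_cur = list(k_blocks), list(v_blocks)
    for _ in range(n):                        # n hops visit every KV block once
        for i in range(n):
            s = q_blocks[i] @ k_cur[i].T / jnp.sqrt(d)
            m_new = jnp.maximum(m[i], s.max(axis=-1))
            p = jnp.exp(s - m_new[:, None])
            scale = jnp.exp(m[i] - m_new)
            denom[i] = denom[i] * scale + p.sum(axis=-1)
            acc[i] = acc[i] * scale[:, None] + p @ v_cur[i]
            m[i] = m_new
        # "Send" each KV block to the next device in the ring.
        k_cur = k_cur[-1:] + k_cur[:-1]
        v_cur = v_cur[-1:] + v_cur[:-1]
    return [a / dn[:, None] for a, dn in zip(acc, denom)]
```

Concatenating the returned per-block outputs should match full softmax attention over the concatenated sequence; the multi-device version additionally overlaps each hop's communication with the current block's computation, so the ring adds no extra latency when compute dominates.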

Taught by

Discover AI
