
Infinite Memory Transformer - Research Paper Explained

Yannic Kilcher via YouTube

Overview

Explore the ∞-former (Infinity-Former) model in this comprehensive video explanation of a research paper. Dive into how this approach extends the vanilla Transformer with an unbounded long-term memory, allowing it to process arbitrarily long sequences. Learn about the continuous attention mechanism that makes the attention complexity independent of context length, and discover the concept of "sticky memories" for highlighting important past events. Follow along as the video breaks down the problem statement, architecture, and experimental results, including applications in language modeling. Gain insight into the pros and cons of using heuristics such as sticky memories, and understand how the model addresses long-range dependencies in sequence tasks.
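
For a concrete feel of the mechanism before watching, here is a minimal sketch, assuming a radial-basis-function parameterization of the long-term memory and a quadrature approximation of the continuous attention integral; the function names and numerical choices are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of the ∞-former's long-term memory (LTM): an arbitrarily long
# token history is compressed into a fixed number of basis coefficients, so the cost
# of attending to it does not grow with context length. All names are hypothetical.
import numpy as np

def rbf_basis(t, n_basis=32, width=0.02):
    """Evaluate n_basis Gaussian radial basis functions, centered uniformly on [0, 1]."""
    centers = np.linspace(0.0, 1.0, n_basis)
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2.0 * width))  # (len(t), n_basis)

def fit_memory(hidden_states, n_basis=32, ridge=1e-4):
    """Compress (seq_len, d) hidden states into (n_basis, d) coefficients B by ridge
    regression, so the memory is the continuous signal X(t) ≈ B^T ψ(t), t in [0, 1]."""
    t = np.linspace(0.0, 1.0, hidden_states.shape[0])
    Psi = rbf_basis(t, n_basis)
    gram = Psi.T @ Psi + ridge * np.eye(n_basis)
    return np.linalg.solve(gram, Psi.T @ hidden_states)

def extend_memory(coeffs, new_states, n_basis=32, n_resample=256):
    """'Concatenation & contraction': sample the current signal, append new states,
    and re-fit the same fixed number of coefficients."""
    t = np.linspace(0.0, 1.0, n_resample)
    old = rbf_basis(t, n_basis) @ coeffs
    return fit_memory(np.concatenate([old, new_states], axis=0), n_basis)

def continuous_attention(query, coeffs, W_mu, W_sigma, n_basis=32, n_quad=100):
    """Continuous attention: the query parameterizes a Gaussian density over [0, 1];
    the returned context is the expectation of X(t) under it, approximated on a grid."""
    mu = 1.0 / (1.0 + np.exp(-(query @ W_mu)))          # attention location in (0, 1)
    sigma = np.logaddexp(0.0, query @ W_sigma) + 1e-3   # attention width (softplus > 0)
    t = np.linspace(0.0, 1.0, n_quad)
    density = np.exp(-0.5 * ((t - mu) / sigma) ** 2)
    density /= density.sum()
    return density @ (rbf_basis(t, n_basis) @ coeffs)

# Usage (illustrative): 10,000 past states compress to 32 coefficients; attending
# to them costs the same regardless of how long the history was.
d, rng = 64, np.random.default_rng(0)
memory = fit_memory(rng.normal(size=(10_000, d)))       # shape (32, 64)
context = continuous_attention(rng.normal(size=d), memory,
                               0.1 * rng.normal(size=d), 0.1 * rng.normal(size=d))
```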

Syllabus

- Intro & Overview
- Sponsor Spot: Weights & Biases
- Problem Statement
- Continuous Attention Mechanism
- Unbounded Memory via concatenation & contraction
- Does this make sense?
- How the Long-Term Memory is used in an attention layer
- Entire Architecture Recap
- Sticky Memories by Importance Sampling (see the code sketch after this list)
- Commentary: Pros and cons of using heuristics
- Experiments & Results
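
The "Sticky Memories by Importance Sampling" chapter covers a heuristic for deciding which parts of the long-term memory deserve the most resolution when it is compressed. Below is a standalone, hedged sketch of that idea; the histogram interface, names, and jittering are illustrative assumptions rather than the paper's exact procedure.

```python
# Hedged sketch of "sticky memories": positions in [0, 1] are importance-sampled in
# proportion to how much attention each region of the memory has received, so heavily
# attended regions keep more resolution when the memory signal is resampled and re-fit.
import numpy as np

def sticky_positions(attention_histogram, n_samples=256, rng=None):
    """attention_histogram: attention mass per equal-width bin partitioning [0, 1].
    Returns n_samples sorted positions, denser where past attention was concentrated."""
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(attention_histogram, dtype=float)
    probs = probs / probs.sum()
    n_bins = len(probs)
    bins = rng.choice(n_bins, size=n_samples, p=probs)        # importance sampling of bins
    jitter = rng.uniform(0.0, 1.0 / n_bins, size=n_samples)   # spread samples inside a bin
    return np.sort(bins / n_bins + jitter)

# Example: attention concentrated on the middle of the memory pulls most of the
# resampling positions toward t ≈ 0.5 before the memory is contracted again.
hist = [0.05, 0.10, 0.50, 0.25, 0.10]
print(sticky_positions(hist, n_samples=8, rng=np.random.default_rng(0)))
```

In the model, positions like these would replace the uniform grid used when the old memory signal is sampled for re-fitting; that is the "sticky" part: events that attracted attention in the past remain represented in finer detail.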

Taught by

Yannic Kilcher

