Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Yannic Kilcher via YouTube

Overview

Explore a comprehensive video lecture on the Mamba architecture, a novel approach to linear-time sequence modeling built on selective state spaces. Delve into how Transformers, RNNs, and S4 models compare before examining state space models and their selective variants. Analyze the Mamba architecture itself in detail, including its SSM layer and forward propagation. Discover how the model exploits the GPU memory hierarchy and achieves efficient computation through prefix sums and parallel scans. Review the experimental results, gain insights from the presenter's comments, and conclude with a brief look at the underlying code. Enhance your understanding of this cutting-edge approach to sequence modeling, which matches or outperforms similarly sized Transformers across modalities such as language, audio, and genomics while offering faster inference and linear scaling in sequence length.
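
For orientation before watching, here is a minimal, purely illustrative sketch of the recurrence the lecture builds up to: a discretized state space update h_t = A_bar_t * h_{t-1} + B_bar_t * x_t whose step size Delta and matrices B and C are recomputed from each input token (the "selective" part). The shapes and parameter names below (W_B, W_C, W_dt, D_skip) are assumptions made for this sketch, not the paper's exact layout, and the real implementation replaces the Python loop with a fused, hardware-aware scan.

```python
import numpy as np

def selective_ssm(x, A, W_B, W_C, W_dt, D_skip):
    """Sequential reference for a selective SSM layer (single sequence).

    Illustrative shapes:
      x      : (L, D)  input sequence with D channels
      A      : (D, N)  fixed, negative-valued state matrix (diagonal per channel)
      W_B    : (D, N)  makes B input-dependent:  B_t = x_t @ W_B
      W_C    : (D, N)  makes C input-dependent:  C_t = x_t @ W_C
      W_dt   : (D, D)  makes the step size Delta input-dependent
      D_skip : (D,)    per-channel skip connection
    """
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                               # recurrent state
    y = np.zeros((L, D))
    for t in range(L):
        xt = x[t]                                      # (D,)
        # Selection: Delta, B, C all depend on the current token.
        dt = np.logaddexp(0.0, xt @ W_dt)              # softplus -> positive step sizes, (D,)
        B_t = xt @ W_B                                 # (N,)
        C_t = xt @ W_C                                 # (N,)
        # Discretize: zero-order hold for A, Euler step for B.
        A_bar = np.exp(dt[:, None] * A)                # (D, N)
        Bx = dt[:, None] * B_t[None, :] * xt[:, None]  # (D, N)
        # Linear recurrence and readout.
        h = A_bar * h + Bx
        y[t] = h @ C_t + D_skip * xt
    return y

# Tiny smoke test with random weights and hypothetical sizes.
rng = np.random.default_rng(0)
L, D, N = 16, 4, 8
x = rng.standard_normal((L, D))
A = -np.exp(rng.standard_normal((D, N)))               # A < 0 keeps the recurrence stable
y = selective_ssm(x, A,
                  0.1 * rng.standard_normal((D, N)),
                  0.1 * rng.standard_normal((D, N)),
                  0.1 * rng.standard_normal((D, D)),
                  rng.standard_normal(D))
print(y.shape)                                         # (16, 4)
```

Because Delta, B, and C are recomputed from each token, the layer can decide at every step how much of its state to keep or overwrite; that input dependence is what breaks S4's fixed convolutional formulation and motivates the scan-based computation discussed later in the video.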

Syllabus

- Introduction
- Transformers vs RNNs vs S4
- What are state space models?
- Selective State Space Models
- The Mamba architecture
- The SSM layer and forward propagation
- Utilizing GPU memory hierarchy
- Efficient computation via prefix sums / parallel scans (see the sketch after this syllabus)
- Experimental results and comments
- A brief look at the code
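
To connect the last few syllabus items: the state update sketched above is a linear (affine) recurrence, and composing two such updates is associative, so the whole sequence can be processed with a parallel prefix scan instead of a step-by-step loop. The toy scalar demonstration below is not Yannic's code or the official CUDA kernel (which, per the paper, also fuses the discretization into the scan and keeps the state in fast GPU SRAM); the Hillis-Steele recursive-doubling scheme and the function names are choices made only for clarity.

```python
import numpy as np

def combine(left, right):
    """Associative composition of two affine steps h -> a*h + b.
    Applying (a1, b1) and then (a2, b2) gives (a1*a2, a2*b1 + b2)."""
    a1, b1 = left
    a2, b2 = right
    return a1 * a2, a2 * b1 + b2

def parallel_scan(a, b):
    """Inclusive prefix scan of h_t = a_t*h_{t-1} + b_t by recursive
    doubling: O(log L) vectorized passes over the sequence."""
    a, b = a.copy(), b.copy()
    L, shift = len(a), 1
    while shift < L:
        # Pad with the identity element (1, 0) and combine each position
        # with the prefix that ends `shift` positions earlier.
        a_prev = np.concatenate([np.ones(shift), a[:-shift]])
        b_prev = np.concatenate([np.zeros(shift), b[:-shift]])
        a, b = combine((a_prev, b_prev), (a, b))
        shift *= 2
    return b                       # b[t] == h_t when h_0 = 0

# Sanity check against the naive sequential recurrence.
rng = np.random.default_rng(0)
a = rng.uniform(0.5, 0.99, size=32)
b = rng.standard_normal(32)
h, ref = 0.0, []
for t in range(32):
    h = a[t] * h + b[t]
    ref.append(h)
assert np.allclose(parallel_scan(a, b), ref)
```

Each pass is a handful of elementwise array operations, so the work parallelizes across the sequence length; this is the property that lets a recurrent model like Mamba train efficiently on long sequences while keeping linear-time, constant-state inference.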

Taught by

Yannic Kilcher
