Flash Attention 2.0 with Tri Dao - Discord Server Talks

Aleksa Gordić - The AI Epiphany via YouTube

Classroom Contents

  1. Main talk starts - intro & motivation
  2. Behind the scenes: how Tri got started with Flash Attention
  3. Motivation: modelling long sequences
  4. Brief recap of attention
  5. Memory bottleneck, IO awareness
  6. Flash Attention 2.0 improvements
  7. Behind the scenes of Flash Attention 2.0 refactor of CUTLASS 3
  8. Future directions
  9. Q&A
