Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Flash Attention Explained - Algorithm, Applications, and Performance
- 1 Introduction
- 2 Flash Attention
- 3 Motivation for Flash Attention
- 4 Downstream Applications
- 5 Histopathology
- 6 Outline
- 7 Attention
- 8 Memory Footprint
- 9 GPU Memory
- 10 Memory Footprint Reduction
- 11 Approximate Attention
- 12 FlashAttention
- 13 Sparsity Fraction
- 14 Empirical Validation
- 15 Benchmarks
- 16 Other Applications
- 17 Long Document Classification
- 18 Path X Benchmark
- 19 Hungry Hungry Hippos
- 20 Simple Hardware Efficient Long Convolutions
- 21 Summary
- 22 Question
- 23 State Space Representation
- 24 Loop Order
- 25 Speed vs Sequence Length
- 26 Hardware vs Algorithms
- 27 Hardware Software Codesign
- 28 Tensor Cores