Rethinking Attention with Performers
- 1 - Intro & Outline
- 2 - Quadratic Bottleneck in Attention Mechanisms
- 3 - Decomposing the Attention Matrix
- 4 - Approximating the Softmax Kernel
- 5 - Different Choices, Different Kernels
- 6 - Why the Naive Approach does not work!
- 7 - Better Approximation via Positive Features
- 8 - Positive Features are Infinitely Better
- 9 - Orthogonal Features are Even Better
- 10 - Experiments
- 11 - Broader Impact Statement
- 12 - Causal Attention via Prefix Sums
- 13 - Code
- 14 - Final Remarks & Conclusion