Classroom Contents
Rethinking Attention with Performers
- 1 - Intro & Outline
- 2 - Quadratic Bottleneck in Attention Mechanisms
- 3 - Decomposing the Attention Matrix
- 4 - Approximating the Softmax Kernel
- 5 - Different Choices, Different Kernels
- 6 - Why the Naive Approach Does Not Work!
- 7 - Better Approximation via Positive Features
- 8 - Positive Features are Infinitely Better
- 9 - Orthogonal Features are Even Better
- 10 - Experiments
- 11 - Broader Impact Statement
- 12 - Causal Attention via Prefix Sums
- 13 - Code
- 14 - Final Remarks & Conclusion