Class Central Classrooms
YouTube videos curated by Class Central.
Classroom Contents
Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention
- 1 - Intro & Overview
- 2 - Softmax Attention & Transformers
- 3 - Quadratic Complexity of Softmax Attention
- 4 - Generalized Attention Mechanism
- 5 - Kernels
- 6 - Linear Attention
- 7 - Experiments
- 8 - Intuition on Linear Attention
- 9 - Connecting Autoregressive Transformers and RNNs
- 10 - Caveats with the RNN connection
- 11 - More Results & Conclusion
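The chapters on the generalized attention mechanism, kernels, linear attention, and the RNN connection all revolve around one reformulation from Katharopoulos et al. (2020): replace the softmax kernel with a positive feature map phi(x) = elu(x) + 1, so attention can be computed in O(N) by reassociating the matrix products, and the causal case becomes a recurrence over a matrix-valued state. The sketch below is an illustrative NumPy rendering of that idea, not the authors' official code; function names and the small demo are assumptions for clarity.

```python
# Illustrative sketch of linear attention (Katharopoulos et al. 2020), not official code.
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1, the positive feature map used in the paper
    return np.where(x > 0, x + 1.0, np.exp(x))

def softmax_attention(Q, K, V):
    """O(N^2) reference: standard scaled dot-product attention."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, eps=1e-6):
    """O(N) non-causal attention: compute phi(Q) (phi(K)^T V) instead of (phi(Q) phi(K)^T) V."""
    Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
    KV = Kf.T @ V                      # (d, d_v) summary of all keys/values
    Z = Qf @ Kf.sum(axis=0) + eps      # (N,) per-query normalizer
    return (Qf @ KV) / Z[:, None]

def linear_attention_rnn(Q, K, V, eps=1e-6):
    """Causal linear attention as an RNN (the 'transformers are RNNs' view):
    a matrix-valued state S accumulates phi(k_i) v_i^T, plus a vector normalizer z."""
    Qf, Kf = elu_feature_map(Q), elu_feature_map(K)
    S = np.zeros((Qf.shape[1], V.shape[1]))
    z = np.zeros(Qf.shape[1])
    out = np.empty_like(V)
    for i in range(Q.shape[0]):
        S += np.outer(Kf[i], V[i])
        z += Kf[i]
        out[i] = (Qf[i] @ S) / (Qf[i] @ z + eps)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, d = 8, 4
    Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
    full = linear_attention(Q, K, V)
    causal = linear_attention_rnn(Q, K, V)
    # At the last position the causal state has seen every token, so the two formulations agree there.
    print(np.allclose(full[-1], causal[-1]))  # True
    print(softmax_attention(Q, K, V).shape)   # (8, 4), quadratic-cost baseline
```

One caveat the video's later chapters discuss: the recurrent view makes autoregressive generation constant-memory per step, but the fixed-size state means the kernelized attention is only an approximation of full softmax attention, not an exact equivalent.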