Transformers Are RNNs: Fast Autoregressive Transformers With Linear Attention

Yannic Kilcher via YouTube

YouTube videos curated by Class Central.

Classroom Contents

  1. Intro & Overview
  2. Softmax Attention & Transformers
  3. Quadratic Complexity of Softmax Attention
  4. Generalized Attention Mechanism
  5. Kernels
  6. Linear Attention
  7. Experiments
  8. Intuition on Linear Attention
  9. Connecting Autoregressive Transformers and RNNs
  10. Caveats with the RNN Connection
  11. More Results & Conclusion
