Rethinking Attention with Performers

Yannic Kilcher via YouTube

YouTube videos curated by Class Central.

Classroom Contents

  1. Intro & Outline
  2. Quadratic Bottleneck in Attention Mechanisms
  3. Decomposing the Attention Matrix
  4. Approximating the Softmax Kernel
  5. Different Choices, Different Kernels
  6. Why the Naive Approach does not work!
  7. Better Approximation via Positive Features
  8. Positive Features are Infinitely Better
  9. Orthogonal Features are Even Better
  10. Experiments
  11. Broader Impact Statement
  12. Causal Attention via Prefix Sums
  13. Code
  14. Final Remarks & Conclusion
