Big Bird: Transformers for Longer Sequences

Yannic Kilcher via YouTube

Classroom Contents

  1. Intro & Overview
  2. Quadratic Memory in Full Attention
  3. Architecture Overview
  4. Random Attention
  5. Window Attention
  6. Global Attention
  7. Architecture Summary
  8. Theoretical Result
  9. Experimental Parameters
  10. Structured Block Computations
  11. Recap
  12. Experimental Results
  13. Conclusion
