Scaling Transformer to 1M Tokens and Beyond with RMT - Paper Explained

Yannic Kilcher via YouTube

Classroom Contents

  1. Intro
  2. Transformers on long sequences
  3. Tasks considered
  4. Recurrent Memory Transformer
  5. Experiments on scaling and attention maps
  6. Conclusion
