Grokking - Generalization Beyond Overfitting on Small Algorithmic Datasets

Yannic Kilcher via YouTube

Classroom Contents

  1. Intro & Overview
  2. The Grokking Phenomenon
  3. Related: Double Descent
  4. Binary Operations Datasets
  5. What quantities influence grokking?
  6. Learned Emerging Structure
  7. The role of smoothness
  8. Simple explanations win
  9. Why does weight decay encourage simplicity?
  10. Appendix
  11. Conclusion & Comments
