
Grokking - Generalization Beyond Overfitting on Small Algorithmic Datasets

Yannic Kilcher via YouTube

Overview

This course explores grokking, a phenomenon in which neural networks trained on small algorithmic datasets suddenly jump from chance-level to perfect generalization, long after they have overfit the training data. It also examines how the structure of the underlying binary operation emerges in the learned embedding space. The syllabus covers the grokking phenomenon itself, its relation to double descent, the quantities that influence grokking, smoothness, simplicity as an explanation, and the role of weight decay. The course is intended for anyone interested in deep learning, neural networks, and generalization in overparametrized models.
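To make the setup concrete, below is a minimal sketch of the kind of experiment the video discusses: a small network trained on a modular-addition table with strong weight decay, where training accuracy saturates quickly while validation accuracy can stay near chance for a long time before rising. This is an illustration, not the paper's or the video's code; the use of PyTorch, the architecture, and all hyperparameters here are assumptions chosen for brevity.

```python
# Minimal sketch (assumed setup, not the paper's code): train a small
# network on the modular-addition table a + b mod P and log train loss
# vs. validation accuracy, the two curves whose late convergence is
# called "grokking". Architecture and hyperparameters are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
P = 97  # small prime modulus, as in the paper's binary-operation tables

# Full operation table: all (a, b) pairs and their labels (a + b) mod P.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P

# Hold out half of the table as the validation split.
perm = torch.randperm(len(pairs))
split = len(pairs) // 2
train_idx, val_idx = perm[:split], perm[split:]

model = nn.Sequential(
    nn.Embedding(P, 128),   # shared embedding for both operands: (N, 2, 128)
    nn.Flatten(),           # -> (N, 256)
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, P),      # one logit per possible result
)
# AdamW with heavy weight decay, the regularizer whose role the video covers.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(10_000):
    model.train()
    loss = loss_fn(model(pairs[train_idx]), labels[train_idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        model.eval()
        with torch.no_grad():
            preds = model(pairs[val_idx]).argmax(dim=-1)
            val_acc = (preds == labels[val_idx]).float().mean().item()
        print(f"step {step}: train loss {loss.item():.3f}, val acc {val_acc:.3f}")
```

In runs of this kind, the paper reports that validation accuracy can hover near chance (about 1/P) well past the point of perfect training accuracy before climbing sharply, and that weight decay markedly speeds up that jump, which is why it features in the syllabus below.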

Syllabus

- Intro & Overview
- The Grokking Phenomenon
- Related: Double Descent
- Binary Operations Datasets
- What quantities influence grokking?
- Learned Emerging Structure
- The role of smoothness
- Simple explanations win
- Why does weight decay encourage simplicity?
- Appendix
- Conclusion & Comments

Taught by

Yannic Kilcher
