Class Central Classrooms
YouTube videos curated by Class Central.
Classroom Contents
On the Foundations of Deep Learning - SGD, Overparametrization, and Generalization
1. Intro
2. Fundamental Questions
3. Challenges
4. What if the Landscape is Bad?
5. Gradient Descent Finds Global Minima
6. Idea: Study Dynamics of the Prediction
7. Local Geometry
8. Local vs Global Geometry
9. What about Generalization Error?
10. Does Overparametrization Hurt Generalization?
11. Background on Margin Theory
12. Max Margin via Logistic Loss
13. Intuition
14. Overparametrization Improves the Margin
15. Optimization with Regularizer
16. Comparison to NTK
17. Is Regularization Needed?
18. Warmup: Logistic Regression
19. What's Special About Gradient Descent?
20. Changing the Geometry: Steepest Descent
21. Steepest Descent: Examples
22. Beyond Linear Models: Deep Networks
23. Implicit Regularization: NTK vs Asymptotic
24. Does Architecture Matter?
25. Example: Changing the Depth in Linear Network
26. Example: Depth in Linear Convolutional Network
27. Random Thoughts