Classroom Contents
Deep Learning Meets Nonparametric Regression: Are Weight-decayed DNNs Locally Adaptive?
- 1 Intro
- 2 From the statistical point of view, the success of DNNs is a mystery.
- 3 Why do Neural Networks work better?
- 4 The "adaptivity" conjecture
- 5 NTKs are strictly suboptimal for locally adaptive nonparametric regression
- 6 Are DNNs locally adaptive? Can they achieve optimal rates for TV-classes/Besov classes?
- 7 Background: Splines are piecewise polynomials
- 8 Background: Truncated power basis for splines (sketched in the first note after this list)
- 9 Weight decay = Total Variation Regularization (see the rescaling sketch after this list)
- 10 Weight-decayed L-layer PNN is equivalent to Sparse Linear Regression with learned basis functions
- 11 Main theorem: Parallel ReLU DNN approaches the minimax rates as it gets deeper.
- 12 Comparing to classical nonparametric regression methods
- 13 Examples of Functions with Heterogeneous Smoothness
- 14 Step 2: Approximation Error Bound
- 15 Summary of take-home messages
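
Chapters 7 and 8 lean on standard spline background. For reference, here is a minimal sketch of the truncated power basis (standard textbook material; the notation is mine, not necessarily the slides'):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Truncated power basis for splines of order m (degree m-1)
% with knots t_1 < \dots < t_K, where (u)_+ := max(u, 0):
\[
\{\,1,\ x,\ \dots,\ x^{m-1}\,\} \ \cup\ \{\,(x - t_k)_+^{\,m-1} : k = 1, \dots, K\,\}.
\]
% Any order-m spline with these knots is a unique linear combination
\[
f(x) \;=\; \sum_{j=0}^{m-1} \beta_j \, x^{j} \;+\; \sum_{k=1}^{K} \theta_k \, (x - t_k)_+^{\,m-1},
\]
% and an l1 penalty on the knot coefficients, sum_k |theta_k|, is how
% total-variation regularization acts in this basis.
\end{document}
```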
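Chapter 9's equivalence rests on a rescaling identity for ReLU units that turns weight decay into an l1 penalty. A minimal sketch of that standard argument, assuming a one-dimensional input and an unpenalized bias (my notation; the talk may state it differently):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% A ReLU unit is invariant to rescaling: for any c > 0,
% a (w x + b)_+ = (a/c) (c w x + c b)_+, since (c u)_+ = c (u)_+.
% Minimizing the weight-decay cost of one unit over this free scale:
\[
\min_{c > 0} \ \tfrac{1}{2}\!\left( \frac{a^2}{c^2} + c^2 w^2 \right)
\;=\; |a\,w|,
\qquad \text{attained at } c^2 = |a|/|w| \ \text{(AM--GM)}.
\]
% Hence the minimal weight-decay cost of a fixed network
% f(x) = sum_j a_j (w_j x + b_j)_+ is sum_j |a_j w_j|: an l1 penalty
% on the effective coefficients, which is the mechanism linking weight
% decay to total variation / sparse linear regression in a learned basis.
\end{document}
```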