From Classical Statistics to Modern ML - The Lessons of Deep Learning - Mikhail Belkin
Institute for Advanced Study via YouTube
Overview
A survey of the shift from classical statistical learning theory to modern machine learning: why empirical risk minimization and capacity control predict a U-shaped generalization curve, why interpolating even very noisy data does not overfit in practice, and how the "double descent" risk curve, random Fourier networks, and fast kernel machines suggest a new framework for understanding generalization in deep learning.
Syllabus
Intro
Empirical Risk Minimization
The ERM/SRM theory of learning
Uniform laws of large numbers
Capacity control
U-shaped generalization curve
Does interpolation overfit?
Interpolation does not overfit even for very noisy data
Why bounds fail
Interpolation is best practice for deep learning
Historical recognition
Where we are now: the key lesson
Generalization theory for interpolation?
Interpolated k-NN schemes
Interpolation and adversarial examples
"Double descent" risk curve
Random Fourier networks
What is the mechanism?
Is infinite width optimal?
Smoothness by averaging
Double Descent in Random Feature settings
Framework for modern ML
The landscape of generalization
Optimization: classical
The power of interpolation
Learning from deep learning: fast and effective kernel machines
Points and lessons
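
The talk's central empirical phenomenon, double descent in random feature settings, is straightforward to reproduce. Below is a minimal illustrative sketch, not code from the talk: it fits minimum-norm least-squares regression on random Fourier features of increasing width, so the test error can be tracked across the interpolation threshold. The data model, feature scale, and grid of widths are all assumptions chosen for demonstration.

import numpy as np

# Illustrative double-descent experiment with random Fourier features (RFF).
# All settings (data model, frequency scale, widths) are assumptions for
# demonstration, not taken from the lecture.

rng = np.random.default_rng(0)

# Noisy 1-D regression problem: y = sin(2*pi*x) + noise.
n_train, n_test, noise = 40, 1000, 0.3
x_train = rng.uniform(-1, 1, n_train)
x_test = rng.uniform(-1, 1, n_test)
y_train = np.sin(2 * np.pi * x_train) + noise * rng.standard_normal(n_train)
y_test = np.sin(2 * np.pi * x_test)

def rff(x, w, b):
    """Random Fourier feature map: cos(w*x + b), one column per feature."""
    return np.cos(np.outer(x, w) + b)

for n_features in [5, 10, 20, 40, 80, 160, 640, 2560]:
    w = rng.normal(0, 8.0, n_features)          # random frequencies
    b = rng.uniform(0, 2 * np.pi, n_features)   # random phases
    phi_train, phi_test = rff(x_train, w, b), rff(x_test, w, b)
    # Pseudo-inverse gives the minimum-norm least-squares solution;
    # it interpolates the training data once n_features >= n_train.
    coef = np.linalg.pinv(phi_train) @ y_train
    test_mse = np.mean((phi_test @ coef - y_test) ** 2)
    print(f"{n_features:5d} features  test MSE = {test_mse:.3f}")

Test error typically peaks near the interpolation threshold (number of features roughly equal to the number of training points) and descends again as the model grows wider, producing the second descent of the curve.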
Taught by
Institute for Advanced Study