Deep Learning Meets Nonparametric Regression: Are Weight-decayed DNNs Locally Adaptive?
USC Information Sciences Institute via YouTube
Overview

A research talk on the statistical theory of deep learning. The central question: are weight-decayed deep neural networks locally adaptive, i.e., can they attain minimax-optimal rates over total-variation and Besov classes, where kernel methods such as the NTK are provably suboptimal? The talk connects weight decay on parallel ReLU networks to total-variation regularization and splines, and presents a main theorem showing that parallel ReLU DNNs approach the minimax rates as depth grows.
Syllabus
Intro
From the statistical point of view, the success of DNNs is a mystery.
Why do Neural Networks work better?
The "adaptivity" conjecture
NTKs are strictly suboptimal for locally adaptive nonparametric regression
Are DNNs locally adaptive? Can they achieve optimal rates for TV-classes/Besov classes?
Background: Splines are piecewise polynomials
Background: Truncated power basis for splines (see the basis formula after this syllabus)
Weight decay = Total Variation Regularization (see the identity sketched after this syllabus)
Weight-decayed L-layer PNN is equivalent to Sparse Linear Regression with learned basis functions
Main theorem: Parallel ReLU DNN approaches the minimax rates as it gets deeper (see the rates and the PyTorch sketch after this syllabus).
Comparing to classical nonparametric regression methods
Examples of Functions with Heterogeneous Smoothness
Step 2: Approximation Error Bound
Summary of take-home messages
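
As a quick reference for the "truncated power basis" item above, the standard textbook definition (background material, not quoted from the talk):

```latex
% Degree-k spline with knots t_1 < ... < t_m in the truncated power basis:
f(x) = \sum_{j=0}^{k} \beta_j x^j \;+\; \sum_{i=1}^{m} c_i \, (x - t_i)_+^{k},
\qquad (u)_+ := \max\{u, 0\}.
```

Note that (x - t_i)_+ is exactly a ReLU with bias -t_i, which is the bridge between splines and ReLU networks that the talk builds on.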
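The "weight decay = total variation regularization" identity can be sketched for a two-layer parallel ReLU network; the following is a standard derivation from the function-space literature (e.g., Savarese et al. 2019; Parhi and Nowak 2021), not the talk's exact statement:

```latex
% Two-layer parallel ReLU network on R:  f(x) = \sum_j a_j ( w_j x - b_j )_+ .
% The rescaling (a_j, w_j) -> (a_j / s_j, s_j w_j), s_j > 0, leaves f unchanged,
% and by AM-GM the weight-decay penalty minimized over rescalings is
\min_{s_j > 0} \; \frac{1}{2} \sum_j \left( \frac{a_j^2}{s_j^2} + s_j^2 w_j^2 \right)
  \;=\; \sum_j |a_j| \, |w_j| \;=\; \mathrm{TV}(f')
% (the last equality holds when the knots b_j / w_j are distinct, since f'
% jumps by a_j w_j at each knot). So L2 weight decay behaves like an
% L1 / total-variation penalty in function space.
```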
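For context on the "NTKs are strictly suboptimal" and "minimax rates" items, the classical rates over total-variation classes, stated from the trend-filtering literature as background (the talk's exact classes and constants may differ):

```latex
% Over the k-th order TV class  \{ f : \mathrm{TV}(f^{(k)}) \le C \}:
%   minimax MSE rate, achieved by locally adaptive methods such as
%   trend filtering and locally adaptive splines:
n^{-(2k+2)/(2k+3)}
%   best possible rate for any linear smoother, a class that includes
%   kernel ridge regression and hence NTK regression:
n^{-(2k+1)/(2k+2)}
% For k = 0 (bounded variation): n^{-2/3} versus n^{-1/2}.
```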
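Finally, a minimal PyTorch sketch of the setup the talk studies: a weight-decayed parallel (one-hidden-layer) ReLU network fit to a function with heterogeneous smoothness. The target function and all hyperparameters are illustrative choices, not the speaker's experiment:

```python
# Minimal sketch (illustrative, not the talk's code): fit a parallel ReLU
# network with and without weight decay on a target whose smoothness varies
# across the domain, and compare the fits against the noiseless target.
import torch
import torch.nn as nn

torch.manual_seed(0)

def target(x):
    # Heterogeneous smoothness: perfectly flat on the left, wiggly on the right.
    return torch.where(x < 0, torch.zeros_like(x), x * torch.sin(8 * x))

n = 256
x = torch.linspace(-1, 1, n).unsqueeze(1)
y = target(x) + 0.1 * torch.randn_like(x)

def fit(weight_decay, width=512, steps=3000):
    # One hidden ReLU layer = a "parallel" network of `width` ReLU units.
    net = nn.Sequential(nn.Linear(1, width), nn.ReLU(), nn.Linear(width, 1))
    # weight_decay is the usual L2 penalty on all parameters; the talk's
    # point is that for ReLU networks it acts like TV regularization.
    opt = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=weight_decay)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((net(x) - y) ** 2).mean()
        loss.backward()
        opt.step()
    return net

for wd in (0.0, 1e-3):
    err = ((fit(wd)(x) - target(x)) ** 2).mean().item()
    print(f"weight_decay={wd}: MSE against noiseless target = {err:.4f}")
```

The weight-decayed fit should track the flat region without inheriting spurious wiggles from the noisy oscillatory region, which is the local adaptivity the talk is about.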
Taught by
USC Information Sciences Institute