Overview
Syllabus
Intro
Scientific context
Parametric supervised machine learning
Convex optimization problems
Exponentially convergent SGD for smooth finite sums
Exponentially convergent SGD for finite sums
Convex optimization for machine learning
Theoretical analysis of deep learning
Optimization for multi-layer neural networks
Gradient descent for a single hidden layer
Optimization on measures
Many particle limit and global convergence (Chizat and Bach, 2018a)
Simple simulations with neural networks
From qualitative to quantitative results?
Lazy training (Chizat and Bach, 2018)
From lazy training to neural tangent kernel
Are state-of-the-art neural networks in the lazy regime?
Is the neural tangent kernel useful in practice?
Can learning theory resist deep learning?
Taught by
Alan Turing Institute