Overview
Explore a framework inspired by random matrix theory for analyzing the dynamics of stochastic optimization algorithms in high-dimensional settings. See why, on generalized linear models and multi-index problems with random data, the dynamics of these algorithms become deterministic when both the sample size and the dimension are large, and how the limiting dynamics are governed by an ODE. In the least squares setting, investigate the implicit conditioning ratio (ICR), which regulates the ability of SGD with momentum (SGD+M) to accelerate. Learn how the convergence rates of SGD+M depend on the batch size and the ICR, and discover explicit choices of learning rate and momentum parameters, based on Hessian spectra, that achieve optimal performance. Gain insight into how this model aligns with performance on real-world datasets. This 56-minute lecture by Courtney Paquette of McGill University was presented at the Simons Institute as part of the Optimization and Algorithm Design series.
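For readers who want a concrete picture of the algorithm under discussion, here is a minimal sketch of mini-batch SGD with heavy-ball momentum on a random least squares problem. It is not taken from the lecture: the function name `sgd_momentum_least_squares`, the problem sizes, and the learning rate and momentum values are illustrative assumptions, and the parameters are not the ICR-based choices the talk describes.

```python
import numpy as np


def sgd_momentum_least_squares(A, b, lr, momentum, batch_size, n_steps, seed=0):
    """Mini-batch SGD with heavy-ball momentum on f(x) = ||Ax - b||^2 / (2n).

    Hypothetical helper for illustration only; returns the final iterate
    and the full-data loss recorded after every step.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)   # iterate
    v = np.zeros(d)   # momentum (velocity) buffer
    losses = np.empty(n_steps)
    for t in range(n_steps):
        # Sample a mini-batch of rows without replacement.
        idx = rng.choice(n, size=batch_size, replace=False)
        residual = A[idx] @ x - b[idx]
        grad = A[idx].T @ residual / batch_size  # stochastic gradient estimate
        # Heavy-ball update: keep a fraction of the previous step.
        v = momentum * v - lr * grad
        x = x + v
        losses[t] = 0.5 * np.mean((A @ x - b) ** 2)
    return x, losses


# Random data with n samples in d dimensions (assumed sizes).
data_rng = np.random.default_rng(1)
n, d = 400, 100
A = data_rng.standard_normal((n, d))
x_star = data_rng.standard_normal(d)
b = A @ x_star  # noiseless targets, so x_star attains zero loss

# Assumed, untuned parameter values.
x_hat, losses = sgd_momentum_least_squares(
    A, b, lr=0.05, momentum=0.5, batch_size=32, n_steps=1000
)
print(f"first-step loss: {losses[0]:.3e}, final loss: {losses[-1]:.3e}")
```

Re-running with different values of `seed` gives a rough, informal sense of how much the loss curve varies from run to run at this problem size.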
Syllabus
Hitting the High-D(imensional) Notes: An ODE for SGD learning dynamics in high-dimensions
Taught by
Simons Institute