Overview
Syllabus
Intro
Why Block Coordinate Descent?
Block Coordinate Descent for Large-Scale Optimization
Why Use Coordinate Descent?
Problems Suitable for Coordinate Descent
Canonical Randomized BCD Algorithm
Better Block Selection Rules
Gauss-Southwell???
Fixed Blocks vs. Variable Blocks
Greedy Rules with Gradient Updates
Gauss-Southwell-Lipschitz vs. Maximum Improvement Rule
Newton-Steps and Quadratic-Norms
Gauss-Southwell-Quadratic Rule
Matrix vs. Newton Updates
Newton's Method vs. Cubic Regularization
Experiment: Multi-class Logistic Regression
Superlinear Convergence?
Optimization with Bound Constraints
Manifold Identification Property
Superlinear Convergence and Proximal-Newton
Message-Passing for Sparse Quadratics
Experiment: Sparse Quadratic Problem
Summary
Taught by
Simons Institute