Matrix Calculus for Machine Learning and Beyond

Overview

Calculus courses such as [*18.01 Single Variable Calculus*](https://ocw.mit.edu/courses/18-01sc-single-variable-calculus-fall-2010/) and [*18.02 Multivariable Calculus*](https://ocw.mit.edu/courses/18-02sc-multivariable-calculus-fall-2010/) cover univariate and vector calculus, respectively. Modern applications such as machine learning and large-scale optimization require the next big step: "matrix calculus" and calculus on arbitrary vector spaces. This class presents a coherent approach to matrix calculus, with techniques that allow you to think of a matrix holistically (not just as an array of scalars), to generalize and compute derivatives of important matrix factorizations and many other complicated-looking operations, and to understand how differentiation formulas must be reimagined in large-scale computing.
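
As a small illustration of treating a matrix holistically, consider the matrix square f(A) = A². Its derivative is the linear map dA ↦ A dA + dA A, not the scalar-style rule 2A dA, because matrices do not commute in general. The following is a minimal sketch that checks this numerically with a finite difference. It is not taken from the course materials; the names `f`, `df`, `A`, and `dA`, the 4×4 size, and the perturbation scale are all assumptions chosen for illustration.

```python
import numpy as np

def f(A):
    """Matrix square, f(A) = A @ A (illustrative choice of function)."""
    return A @ A

def df(A, dA):
    """Derivative of f at A, applied as a linear operator to the perturbation dA."""
    return A @ dA + dA @ A

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))          # arbitrary test matrix
dA = 1e-6 * rng.standard_normal((4, 4))  # small random perturbation

# Actual change in f versus the change predicted by the linear operator.
exact_change = f(A + dA) - f(A)
linearized = df(A, dA)
scale = np.linalg.norm(linearized)
rel_err = np.linalg.norm(exact_change - linearized) / scale

# The scalar-calculus guess 2*A*dA ignores non-commutativity and should not match.
naive = 2 * A @ dA
naive_err = np.linalg.norm(exact_change - naive) / scale

print("relative error of A dA + dA A:", rel_err)
print("relative error of 2 A dA:     ", naive_err)
```

Under these assumptions, the first error should be tiny (on the order of the perturbation size), while the second should be of order one, which is the sense in which the derivative has to be understood as an operator acting on the whole matrix rather than entry by entry.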

Syllabus

Lecture 1 Part 1: Introduction and Motivation
Lecture 1 Part 2: Derivatives as Linear Operators
Lecture 2 Part 1: Derivatives in Higher Dimensions: Jacobians and Matrix Functions
Lecture 2 Part 2: Vectorization of Matrix Functions
Lecture 3 Part 1: Kronecker Products and Jacobians
Lecture 3 Part 2: Finite-Difference Approximations
Lecture 4 Part 1: Gradients and Inner Products in Other Vector Spaces
Lecture 4 Part 2: Nonlinear Root Finding, Optimization, and Adjoint Gradient Methods
Lecture 5 Part 1: Derivative of Matrix Determinant and Inverse
Lecture 5 Part 2: Forward Automatic Differentiation via Dual Numbers
Lecture 5 Part 3: Differentiation on Computational Graphs
Lecture 6 Part 1: Adjoint Differentiation of ODE Solutions
Lecture 6 Part 2: Calculus of Variations and Gradients of Functionals
Lecture 7 Part 1: Derivatives of Random Functions
Lecture 7 Part 2: Second Derivatives, Bilinear Forms, and Hessian Matrices
Lecture 8 Part 1: Derivatives of Eigenproblems
Lecture 8 Part 2: Automatic Differentiation on Computational Graphs