Matrix Calculus for Machine Learning and Beyond
Massachusetts Institute of Technology via MIT OpenCourseWare
Overview
We all know that calculus courses such as [*18.01 Single Variable Calculus*](https://ocw.mit.edu/courses/18-01sc-single-variable-calculus-fall-2010/) and [*18.02 Multivariable Calculus*](https://ocw.mit.edu/courses/18-02sc-multivariable-calculus-fall-2010/) cover univariate and vector calculus, respectively. Modern applications such as machine learning and large-scale optimization require the next big step: "matrix calculus" and calculus on arbitrary vector spaces.
This class presents a coherent approach to matrix calculus, showing techniques that allow you to think of a matrix holistically (not just as an array of scalars), generalize and compute derivatives of important matrix factorizations and many other complicated-looking operations, and understand how differentiation formulas must be reimagined in large-scale computing.
Syllabus
- Lecture 1 Part 1: Introduction and Motivation
- Lecture 1 Part 2: Derivatives as Linear Operators
- Lecture 2 Part 1: Derivatives in Higher Dimensions: Jacobians and Matrix Functions
- Lecture 2 Part 2: Vectorization of Matrix Functions
- Lecture 3 Part 1: Kronecker Products and Jacobians
- Lecture 3 Part 2: Finite-Difference Approximations
- Lecture 4 Part 1: Gradients and Inner Products in Other Vector Spaces
- Lecture 4 Part 2: Nonlinear Root Finding, Optimization, and Adjoint Gradient Methods
- Lecture 5 Part 1: Derivative of Matrix Determinant and Inverse
- Lecture 5 Part 2: Forward Automatic Differentiation via Dual Numbers
- Lecture 5 Part 3: Differentiation on Computational Graphs
- Lecture 6 Part 1: Adjoint Differentiation of ODE Solutions
- Lecture 6 Part 2: Calculus of Variations and Gradients of Functionals
- Lecture 7 Part 1: Derivatives of Random Functions
- Lecture 7 Part 2: Second Derivatives, Bilinear Forms, and Hessian Matrices
- Lecture 8 Part 1: Derivatives of Eigenproblems
- Lecture 8 Part 2: Automatic Differentiation on Computational Graphs
Taught by
Prof. Alan Edelman and Prof. Steven G. Johnson