Beyond Lazy Training for Over-parameterized Tensor Decomposition
Institute for Pure & Applied Mathematics (IPAM) via YouTube
Overview
Syllabus
Intro
Tensor (CP) decomposition
Why naïve algorithm fails
Why gradient descent?
Two-Layer Neural Network
Form of the objective
Difficulties of analyzing gradient descent
Lazy training fails
0 is a high-order saddle point
Our (high level) algorithm
Proof ideas
Iterates remain close to correct subspace
Escaping local minima by random correlation
Amplify initial correlation by tensor power method
Conclusions and Open Problems
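The syllabus mentions amplifying an initial correlation via the tensor power method. As a rough, self-contained illustration (an assumed toy setup with orthonormal components, not the speakers' actual algorithm), the iteration x ← T(I, x, x) / ||T(I, x, x)|| squares the correlation with each ground-truth component at every step, so a small random initial correlation is amplified until x aligns with one component:

```python
import numpy as np

# Illustrative sketch of the tensor power method for symmetric CP
# decomposition (hypothetical example, not the talk's algorithm).
rng = np.random.default_rng(0)
d, r = 8, 3

# Orthonormal ground-truth components a_1, ..., a_r (columns of A).
A = np.linalg.qr(rng.standard_normal((d, r)))[0]

# Symmetric rank-r tensor T = sum_i a_i (x) a_i (x) a_i.
T = np.einsum('di,ei,fi->def', A, A, A)

# Power iteration: x <- T(I, x, x) / ||T(I, x, x)||.
# Each correlation <a_i, x> is squared per step, so the largest
# initial correlation quickly dominates the others.
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
for _ in range(30):
    x = np.einsum('def,e,f->d', T, x, x)
    x /= np.linalg.norm(x)

# x should now align with one column of A, up to sign.
best = float(np.max(np.abs(A.T @ x)))
print(best > 0.99)
```

Here the components are taken orthonormal so that plain power iteration provably converges; the talk's over-parameterized gradient-descent setting is more delicate, which is exactly the difficulty the syllabus items above address.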
Taught by
Institute for Pure & Applied Mathematics (IPAM)