Deep Learning Applications by Rina Panigrahy
International Centre for Theoretical Sciences via YouTube
Overview
Syllabus
Statistical Physics Methods in Machine Learning
Deep Learning Applications
Tutorial: Deep Learning
Outline
Learning an unknown function
Learning an unknown function: like curve fitting
Learning a function: why?
Learning a function: how?
Linear Regression: Line fitting
Minimize error/loss in prediction
Loss measures error in prediction
Gradient descent
Learning a function: Linear Regression
Gradient update: Backpropagation
Stochastic Gradient Descent: gradients over a few examples at a time.
Learning a function: Sigmoid, sign
Sigmoid, ReLU
Logistic regression uses logloss
Deep Network: allows rich representation; can express any function/circuit
Neurons
Network of Neurons
Hierarchical representation of Objects
Training w: SGD to Minimize loss
Backpropagation: Gradient Descent for one example
Softmax for multiclass output: just like max
Convergence of Gradient Descent for Model training
Applications
MNIST
Convolution and Pooling
Gradient-Based Learning Applied to Document Recognition
Goal
ImageNet
ILSVRC
Architecture
ReLU Nonlinearity
96 Convolutional Kernels
Phone recognition on the TIMIT benchmark (Mohamed, Dahl, & Hinton)
Word error rates from MSR, IBM, & Google (Hinton et al., IEEE Signal Processing Magazine, Nov 2012)
Speech recognition
RNN
Videos/tutorials on Deep learning applications
Theoretical Understanding? - Deep Learning
Nonconvex Optimization
Low rank Approximation
No local minima in linear networks [Kawaguchi, NIPS 16; Ge et al., ICML 17]
Deep Learning
Does well experimentally
With simplifications, our target functions f are...
Overview of Results
Q&A
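The syllabus items on line fitting, loss, and stochastic gradient descent ("gradients over a few examples at a time") can be illustrated with a short sketch. This is not taken from the lecture: the function name `sgd_linear_regression`, the half-mean-squared-error loss, the learning rate, the batch size, and the synthetic data are all assumptions chosen for illustration.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.1, epochs=200, batch_size=8, seed=0):
    """Fit y ~ w*x + b with mini-batch SGD (hypothetical helper, not from the lecture).

    Loss per batch is assumed to be 0.5 * mean((w*x + b - y)^2).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new random order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Prediction error on each example in the batch
            errors = [(w * xs[i] + b) - ys[i] for i in batch]
            # Gradients of the assumed loss with respect to w and b
            grad_w = sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = sum(errors) / len(batch)
            # Gradient step: move the parameters against the gradient
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Synthetic, noise-free data (assumed for illustration): y = 2x + 1 on [-1, 1]
xs = [-1.0 + 2.0 * k / 63 for k in range(64)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_regression(xs, ys)
print(w, b)
```

On this noise-free data the fitted `w` and `b` should approach the slope 2 and intercept 1 used to generate it. With noisy data, the estimates would instead fluctuate around the least-squares solution unless the learning rate is decayed.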
Taught by
International Centre for Theoretical Sciences