Syllabus
Intro
vision and speech
acoustic variability (words)
acoustic variability (sentences)
feedforward models for vision
invariance in auditory cortex and models
representations for recognition
bootstrapping for learning by children
statistical learning
data representation
deep representations
learning representations
invariant and selective representations
groups and visual transformations
orbit of group transformations
orbits are unique and invariant
invariance via group averaging
transferability
(probabilistic) selectivity
tl;dr summary
algorithms for signatures
computations for signatures
learning templates and transformations
learning by implicit supervision
word orbits by signal manipulation
word representations
isolated word classification
how many templates and pooling functions?
MFCC comparison (linear kernel)
MFCC comparison (same dimensionality)
MFCC comparison (RBF kernel)
representation visualization
(segmental) phone representations
sample complexity: vowel classification
multilayer frame representations
frame-based phone classification
learnable templates in CNNs
VTL-convolutional networks
acoustic modeling: HMM state classification
data augmentation?
looking forward
Taught by
MITCBMM