Learning with Symmetry and Invariance for Speech Perception

Overview

This lecture covers symmetry and invariance in speech perception from the perspective of computational models and machine learning. It begins with the acoustic variability of words and sentences, feedforward models for vision, and invariance in the auditory cortex, then turns to representations for recognition, bootstrapping in child learning, and statistical learning. From there it develops invariant and selective representations: group transformations, orbits and why they are unique and invariant, invariance via group averaging, algorithms and computations for signatures, and learning templates and transformations. Applications to speech follow, including word orbits, isolated word classification with comparisons against MFCC features, representation visualization, segmental phone representations, sample complexity in vowel classification, multilayer frame representations, and frame-based phone classification. The lecture closes with learnable templates in CNNs, VTL-convolutional networks, acoustic modeling through HMM state classification, data augmentation, and future directions.

Syllabus

Intro
vision and speech
acoustic variability (words)
acoustic variability (sentences)
feedforward models for vision
invariance in auditory cortex and models
representations for recognition
bootstrapping for learning by children
statistical learning
data representation
deep representations
learning representations
invariant and selective representations
groups and visual transformations
orbit of group transformations
orbits are unique and invariant
invariance via group averaging
transferability
(probabilistic) selectivity
tl;dr summary
algorithms for signatures
computations for signatures
learning templates and transformations
learning by implicit supervision
Word orbits by signal manipulation
word representations
isolated word classification
how many templates and pooling functions?
MFCC comparison (linear kernel)
MFCC comparison (same dimensionality)
MFCC comparison (RBF kernel)
representation visualization
(segmental) phone representations
Sample complexity: vowel classification
multilayer frame representations
frame-based phone classification
learnable templates in CNNs
VTL-convolutional networks
acoustic modeling: HMM state classification
data augmentation?
looking forward

Taught by

MITCBMM
