Overview
Syllabus
Intro
An Example Prediction Problem: Sentence Classification
A First Try: Bag of Words (BOW)
Continuous Bag of Words (CBOW)
What do Our Vectors Represent?
Why Bag of n-grams?
What Are the Problems with Bag of n-grams?
Time Delay Neural Networks (Waibel et al. 1989)
Convolutional Networks (LeCun et al. 1997)
Standard conv2d Function
Stacked Convolution
Dilated Convolution (e.g. Kalchbrenner et al. 2016)
An Aside: Nonlinear Functions (proper choice of a nonlinear function is essential in stacked networks)
Why (Dilated) Convolution for Modeling Sentences? (in contrast to recurrent neural networks, covered next class)
Example: Dependency Structure
Why Model Sentence Pairs?
Siamese Network (Bromley et al. 1993)
Taught by
Graham Neubig