Overview
Syllabus
Intro
A First Try: Bag of Words (BOW)
Continuous Bag of Words (CBOW)
What do Our Vectors Represent?
Bag of n-grams
Why Bag of n-grams?
2-dimensional Convolutional Networks
CNNs for Sentence Modeling
Standard conv2d Function
Padding
Striding
Pooling: Pooling is like convolution, but calculates some reduction function feature-wise
• Max pooling: "Did you see this feature anywhere in the range?" (most common)
• Average pooling: "How prevalent is this feature over the entire range?"
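A minimal sketch of the two pooling reductions described above, written in plain Python. The names `max_pool`, `avg_pool`, and `feats` are illustrative and not taken from the course materials; the input is assumed to be a list of time steps, each holding one value per feature.

```python
def max_pool(features):
    """Max over time for each feature: was the feature seen anywhere in the range?"""
    return [max(column) for column in zip(*features)]


def avg_pool(features):
    """Mean over time for each feature: how prevalent is it over the whole range?"""
    return [sum(column) / len(column) for column in zip(*features)]


# Hypothetical feature map: 3 time steps (rows) x 3 features (columns).
feats = [
    [0.1, 2.0, -1.0],
    [0.7, 0.5, 0.0],
    [0.3, 1.5, 4.0],
]

pooled_max = max_pool(feats)  # one value per feature
pooled_avg = avg_pool(feats)  # one value per feature
```

Both functions reduce a variable-length sequence to a fixed-size vector with one entry per feature, which is what makes pooling useful on top of a convolution over a sentence.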
Stacked Convolution
Dilated Convolution (e.g. Kalchbrenner et al. 2016): Gradually increase stride every time step (no reduction in length)
Iterated Dilated Convolution (Strubell et al. 2017): Multiple iterations of the same stack of dilated convolutions
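The two dilated-convolution items above can be illustrated with a small plain-Python sketch. It makes several assumptions that are not in the course materials: a single channel, a fixed odd-width kernel of ones instead of learned weights, zero padding, no non-linearity, and a dilation schedule of (1, 2, 4). All function names are hypothetical.

```python
def dilated_conv1d(x, kernel, dilation):
    """Same-length 1D convolution whose kernel taps are `dilation` steps apart."""
    assert len(kernel) % 2 == 1, "this sketch assumes an odd kernel width"
    half = len(kernel) // 2
    out = []
    for t in range(len(x)):
        total = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - half) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence act as zero padding
                total += w * x[idx]
        out.append(total)
    return out


def dilated_stack(x, kernel, dilations=(1, 2, 4)):
    """One stack: the gap between taps grows per layer, the length stays fixed."""
    h = x
    for d in dilations:
        h = dilated_conv1d(h, kernel, d)
    return h


def iterated_dilated(x, kernel, iterations, dilations=(1, 2, 4)):
    """Apply the same stack (same kernel, same dilations) several times."""
    h = x
    for _ in range(iterations):
        h = dilated_stack(h, kernel, dilations)
    return h


# A single impulse in the middle of a length-31 sequence shows the receptive field.
impulse = [0.0] * 31
impulse[15] = 1.0
ones = [1.0, 1.0, 1.0]

once = dilated_stack(impulse, ones)
twice = iterated_dilated(impulse, ones, iterations=2)

width_once = sum(1 for v in once if v != 0.0)
width_twice = sum(1 for v in twice if v != 0.0)
```

Under these assumptions the output keeps the input length at every layer, while the number of positions influenced by the impulse grows with each layer and again with each repetition of the stack.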
Non-linear Functions
Which Non-linearity Should I Use?
Taught by
Graham Neubig