Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
MIT: Recurrent Neural Networks
- 1 Intro
- 2 Sequences in the wild
- 3 A sequence modeling problem: predict the next word
- 4 Use a fixed window
- 5 Can't model long-term dependencies
- 6 Use entire sequence as set of counts
- 7 Counts don't preserve order
- 8 Use a really big fixed window
- 9 No parameter sharing
- 10 Sequence modeling: design criteria
- 11 Standard feed-forward neural network
- 12 Recurrent neural networks: sequence modeling
- 13 A standard "vanilla" neural network
- 14 A recurrent neural network (RNN)
- 15 RNN state update and output (see the sketch after this list)
- 16 RNNs: computational graph across time
- 17 Recall: backpropagation in feed-forward models
- 18 RNNs: backpropagation through time
- 19 Standard RNN gradient flow: exploding gradients
- 20 Standard RNN gradient flow: vanishing gradients
- 21 The problem of long-term dependencies
- 22 Trick #1: activation functions
- 23 Trick #2: parameter initialization
- 24 Standard RNN: repeating modules contain a simple computation node
- 25 Long Short-Term Memory (LSTMs)
- 26 LSTMs: forget irrelevant information
- 27 LSTMs: output filtered version of cell state
- 28 LSTM gradient flow
- 29 Example task: music generation
- 30 Example task: sentiment classification
- 31 Example task: machine translation
- 32 Attention mechanisms
- 33 Recurrent neural networks (RNNs)
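Item 15 above covers the core recurrence: the hidden state is updated from the previous state and the current input, and an output is read off the new state. Below is a minimal NumPy sketch of that update, h_t = tanh(W_hh h_{t-1} + W_xh x_t) with output y_t = W_hy h_t. The dimensions, weight names, and the toy input sequence are illustrative assumptions, not code from the lecture.

```python
# Minimal sketch of a vanilla RNN state update (item 15), assuming
# h_t = tanh(W_hh @ h_{t-1} + W_xh @ x_t) and y_t = W_hy @ h_t.
# All shapes and the random toy input are illustrative, not from the lecture.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 8, 16, 4

# The same weights are reused at every time step -- the parameter sharing
# that the fixed-window approaches in items 4 and 8 lack.
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
W_hy = rng.normal(scale=0.1, size=(output_dim, hidden_dim))

def rnn_step(x_t, h_prev):
    """One recurrence step: update the hidden state, then emit an output."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)
    y_t = W_hy @ h_t
    return h_t, y_t

# Unroll over a toy sequence of 5 steps (the computational graph across
# time in item 16); training would backpropagate through this unrolled graph.
h = np.zeros(hidden_dim)
sequence = rng.normal(size=(5, input_dim))
for x_t in sequence:
    h, y = rnn_step(x_t, h)
print(y.shape)  # (4,) -- the output for the final time step
```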