Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Neural Nets for NLP 2020 - Language Modeling, Efficiency/Training Tricks
- 1 Intro
- 2 Language Modeling: Calculating
- 3 Count-based Language Models
- 4 A Refresher on Evaluation
- 5 Problems and Solutions? Cannot share strength among similar words
- 6 Example
- 7 Softmax
- 8 A Computation Graph View
- 9 A Note: "Lookup"
- 10 Training a Model
- 11 Parameter Update
- 12 Unknown Words
- 13 Evaluation and Vocabulary
- 14 Linear Models can't Learn Feature Combinations
- 15 Neural Language Models (See Bengio et al. 2004)
- 16 Tying Input/Output Embeddings
- 17 Standard SGD
- 18 SGD With Momentum
- 19 Adagrad
- 20 Adam: the most standard optimization option in NLP and beyond; considers a rolling average of the gradient, plus momentum
- 21 Shuffling the Training Data
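The last chapters cover Adam and shuffling the training data. As a rough illustration of both ideas (not the lecture's own code), here is a minimal NumPy sketch: an Adam step that keeps rolling averages of the gradient (momentum) and the squared gradient, with bias correction, inside a toy training loop that reshuffles the data order every epoch. The function name, hyperparameters, and toy objective are all illustrative assumptions.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (illustrative sketch): rolling averages of the
    gradient (momentum) and the squared gradient, with bias correction."""
    m = beta1 * m + (1 - beta1) * grad          # first moment: momentum
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment: adaptive scale
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy loop: minimize (w - 3)^2, reshuffling the training order each epoch.
rng = np.random.default_rng(0)
data = np.arange(10)
w = np.array(0.0)
m = v = np.array(0.0)
t = 0
for epoch in range(200):
    rng.shuffle(data)                # shuffle training data every epoch
    for _ in data:
        t += 1
        grad = 2 * (w - 3.0)         # gradient of the toy objective
        w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
print(float(w))
```

In this toy setting shuffling changes nothing (every example yields the same gradient); its real benefit, as the lecture chapter suggests, is avoiding biased update order on heterogeneous training data.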