
Neural Nets for NLP 2020 - Language Modeling, Efficiency/Training Tricks

Graham Neubig via YouTube

Overview

Explore a comprehensive lecture on language modeling and neural network training techniques for natural language processing. Delve into feed-forward neural network language models, methods to prevent overfitting, and mini-batching techniques. Learn about automatic optimization, including automatic minibatching and code-level optimization, as well as various optimizers. Discover how to measure language model performance using accuracy, likelihood, and perplexity metrics. Gain insights into handling unknown words, evaluation strategies, and vocabulary considerations. Examine the limitations of linear models and explore neural language models, including input/output embedding tying. Study different optimization algorithms such as standard SGD, SGD with momentum, Adagrad, and Adam. Understand the importance of shuffling training data for improved model performance.
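As a rough illustration of the topics above (not material from the lecture itself), the sketch below builds a small Bengio-style feed-forward language model in PyTorch and computes perplexity as the exponential of the average negative log-likelihood. The vocabulary size, context length, and layer dimensions are arbitrary choices for the example.

```python
# Minimal sketch (assumes PyTorch): a feed-forward neural language model
# over a fixed context window, plus perplexity from average NLL.
import math
import torch
import torch.nn as nn

class FeedForwardLM(nn.Module):
    def __init__(self, vocab_size, context_len=3, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.hidden = nn.Linear(context_len * emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, context_ids):
        # context_ids: (batch, context_len) -> logits over the next word
        emb = self.embed(context_ids).flatten(start_dim=1)
        return self.out(torch.tanh(self.hidden(emb)))

vocab_size, context_len = 1000, 3          # illustrative sizes
model = FeedForwardLM(vocab_size, context_len)
loss_fn = nn.CrossEntropyLoss()

# Fake mini-batch: 8 contexts of 3 word ids each, plus the true next words.
contexts = torch.randint(0, vocab_size, (8, context_len))
targets = torch.randint(0, vocab_size, (8,))

nll = loss_fn(model(contexts), targets)    # average negative log-likelihood
perplexity = math.exp(nll.item())          # perplexity = exp(avg NLL)
print(f"perplexity on this random batch: {perplexity:.1f}")
```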

Syllabus

Intro
Language Modeling: Calculating
Count-based Language Models
A Refresher on Evaluation
Problems and Solutions: Cannot Share Strength Among Similar Words
Example
Softmax
A Computation Graph View
A Note: "Lookup"
Training a Model
Parameter Update
Unknown Words
Evaluation and Vocabulary
Linear Models can't Learn Feature Combinations
Neural Language Models (See Bengio et al. 2004)
Tying Input/Output Embeddings
Standard SGD
SGD With Momentum
Adagrad
Adam: the most standard optimization option in NLP and beyond; considers a rolling average of the gradient, plus momentum (see the optimizer sketch after this syllabus)
Shuffling the Training Data
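
The syllabus closes with several optimizers and a note on shuffling the training data. As a rough sketch (not the lecture's code), the example below constructs each optimizer named above in PyTorch and a shuffled mini-batch loader; the toy parameters, dataset, and hyperparameter values are assumptions made for illustration.

```python
# Minimal sketch (assumes PyTorch): the optimizers listed above and
# epoch-wise shuffling of training mini-batches.
import torch
from torch.utils.data import DataLoader, TensorDataset

params = [torch.nn.Parameter(torch.randn(10))]  # stand-in model parameters

optimizers = {
    "standard SGD":      torch.optim.SGD(params, lr=0.1),
    "SGD with momentum": torch.optim.SGD(params, lr=0.1, momentum=0.9),
    "Adagrad":           torch.optim.Adagrad(params, lr=0.1),
    # Adam keeps rolling averages of the gradient (momentum) and its square.
    "Adam":              torch.optim.Adam(params, lr=0.001),
}

# Shuffling the training data: DataLoader reshuffles the examples every epoch.
data = TensorDataset(torch.randn(100, 10), torch.randint(0, 2, (100,)))
loader = DataLoader(data, batch_size=16, shuffle=True)
```

In practice, one of these optimizers would be chosen and `optimizer.step()` called after each backward pass over a shuffled mini-batch.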

Taught by

Graham Neubig
