Overview
Syllabus
Intro
In Neural Networks, Tuning is Paramount!
A Typical Situation
Identifying Training Time Problems
Is My Model Too Weak?
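One standard first diagnostic for this question (a hedged sketch; the lecture may use a different recipe): check that the model can drive training loss to near zero on a single tiny batch. If it cannot memorize 16 examples, the problem is capacity or optimization, not the data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 3))
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# A single tiny batch with random labels: a healthy model should memorize it.
x, y = torch.randn(16, 8), torch.randint(0, 3, (16,))
for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
print(loss.item())  # should approach 0; a high plateau points to capacity or optimization
```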
Be Careful of Deep Models
Trouble w/ Optimization
Reminder: Optimizers
• SGD: take a step in the direction opposite the gradient
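A minimal sketch of that update in plain NumPy (my illustration, not code from the lecture): with learning rate η, parameters move as θ ← θ − η∇L(θ).

```python
import numpy as np

def sgd_step(theta, grad, lr=0.1):
    """One vanilla SGD step: move against the gradient to decrease the loss."""
    return theta - lr * grad

# Toy check: minimize f(theta) = ||theta||^2, whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
for _ in range(100):
    theta = sgd_step(theta, 2 * theta)
print(theta)  # close to [0, 0]
```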
Learning Rate
• The learning rate is a critical hyperparameter: too large and training diverges, too small and it converges slowly or stalls
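To make the sensitivity concrete, a toy gradient-descent run on f(x) = x² (an illustrative example, not from the slides): a modest step size converges, an overly large one diverges.

```python
def gd(lr, steps=20, x=1.0):
    for _ in range(steps):
        x = x - lr * 2 * x  # gradient of x**2 is 2*x
    return x

print(gd(lr=0.1))  # ~0.01: converges toward the minimum
print(gd(lr=1.1))  # ~38 in magnitude: each step overshoots, so it diverges
```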
Initialization
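The outline does not say which scheme the lecture presents; as one standard example, Xavier/Glorot uniform initialization (Glorot and Bengio 2010) sizes weights by fan-in and fan-out so activation variance stays stable across layers. A hedged sketch:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, seed=0):
    """Glorot & Bengio (2010): bound chosen so activation variance stays stable."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.default_rng(seed).uniform(-limit, limit, size=(fan_in, fan_out))

W = xavier_uniform(256, 512)  # weight matrix for a 256 -> 512 layer
```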
Debugging Minibatching
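One common sanity check for minibatch code (a sketch under my own assumptions, not necessarily the lecture's recipe): the loss of a batch should equal the sum of per-example losses computed one at a time; a mismatch usually points to a padding, masking, or reduction bug.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(8, 3)                        # stand-in for a real network
loss_fn = nn.CrossEntropyLoss(reduction="sum")

x = torch.randn(4, 8)                          # a minibatch of 4 examples
y = torch.tensor([0, 2, 1, 2])

batched = loss_fn(model(x), y)
one_by_one = sum(loss_fn(model(x[i:i + 1]), y[i:i + 1]) for i in range(4))

# A mismatch usually means a padding, masking, or loss-reduction bug.
assert torch.allclose(batched, one_by_one, atol=1e-5)
```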
Debugging Decoding
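One plausible check along these lines (my sketch; the toy model and helper names are invented for illustration): score the sequence your decoder emits with the same scoring code used for the training loss, and verify the two numbers agree.

```python
import math

# Toy "model": fixed next-token log-probs (a real model conditions on the prefix).
LOGP = {"a": math.log(0.6), "b": math.log(0.3), "</s>": math.log(0.1)}

def step_logprobs(prefix):
    return LOGP

def greedy_decode(max_len=5):
    seq, score = [], 0.0
    for _ in range(max_len):
        tok, lp = max(step_logprobs(seq).items(), key=lambda kv: kv[1])
        seq.append(tok)
        score += lp
        if tok == "</s>":
            break
    return seq, score

def score_sequence(seq):
    """Training-style scorer: sum the log-probs of the given tokens."""
    return sum(step_logprobs(seq[:i])[tok] for i, tok in enumerate(seq))

seq, decode_score = greedy_decode()
# If the decoder's accumulated score disagrees with the training-style scorer
# on the decoder's own output, decoding and training are using the model
# differently somewhere (a classic bug).
assert abs(decode_score - score_sequence(seq)) < 1e-9
```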
Debugging Search
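A related invariant worth testing (again an illustrative sketch, not the lecture's code): a stronger search, e.g. exhaustive or a wider beam, must achieve a model score at least as high as greedy; if it ever scores lower, the search implementation is buggy.

```python
import itertools, math

# Toy two-step model: P(x1) and P(x2 | x1), chosen so greedy is suboptimal.
P1 = {"a": 0.6, "b": 0.4}
P2 = {"a": {"c": 0.5, "d": 0.5}, "b": {"c": 0.9, "d": 0.1}}

def score(seq):
    x1, x2 = seq
    return math.log(P1[x1]) + math.log(P2[x1][x2])

g1 = max(P1, key=P1.get)                        # greedy first token: "a"
greedy = (g1, max(P2[g1], key=P2[g1].get))
best = max(itertools.product("ab", "cd"), key=score)

# A stronger search must never score worse than a weaker one under the model;
# if it does, the search implementation (not the model) is at fault.
assert score(best) >= score(greedy)
print(greedy, score(greedy))  # ('a', 'c'), log 0.30
print(best, score(best))      # ('b', 'c'), log 0.36
```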
Look At Your Data!
Quantitative Analysis
Symptoms of Overfitting
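The textbook symptom: training loss keeps falling while dev loss bottoms out and rises. A tiny detector over hypothetical loss curves (the numbers are invented for illustration):

```python
# Hypothetical loss curves, one value per epoch.
train_loss = [2.1, 1.5, 1.1, 0.8, 0.6, 0.45, 0.33]
dev_loss = [2.2, 1.7, 1.4, 1.3, 1.35, 1.5, 1.7]

best_epoch = min(range(len(dev_loss)), key=lambda i: dev_loss[i])
# Overfitting signature: dev loss bottomed out while train loss kept falling.
overfit = best_epoch < len(dev_loss) - 1 and train_loss[-1] < train_loss[best_epoch]
print(f"dev loss bottomed out at epoch {best_epoch}; overfitting: {overfit}")
```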
Reminder: Early Stopping, Learning Rate Decay
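A hedged sketch of the usual recipe in PyTorch (my assumptions: toy data, MSE loss, `ReduceLROnPlateau` for the decay): keep the best checkpoint by dev loss, decay the learning rate on a plateau, and stop after `patience` epochs without improvement.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)                                 # stand-in for a real network
opt = torch.optim.SGD(model.parameters(), lr=0.1)
# Halve the learning rate whenever dev loss stops improving.
sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.5, patience=1)

x, y = torch.randn(64, 4), torch.randn(64, 1)           # toy train set
dx, dy = torch.randn(32, 4), torch.randn(32, 1)         # toy dev set

best, best_state, bad, patience = float("inf"), None, 0, 3
for epoch in range(100):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()                 # MSE training loss
    loss.backward()
    opt.step()
    with torch.no_grad():
        dev_loss = ((model(dx) - dy) ** 2).mean().item()
    sched.step(dev_loss)                                # learning rate decay
    if dev_loss < best:                                 # track the best checkpoint
        best, bad = dev_loss, 0
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        bad += 1
        if bad >= patience:                             # early stopping
            break
model.load_state_dict(best_state)                       # roll back to the best model
```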
Reminder: Dropout (Srivastava et al. 2014)
• Neural nets have lots of parameters, and are prone to overfitting
• Dropout: randomly zero out nodes in the hidden layer with probability p at training time only
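A minimal PyTorch illustration of that train-time-only behavior (not code from the paper or lecture): `nn.Dropout` zeroes units in `train()` mode and is the identity in `eval()` mode, rescaling survivors by 1/(1−p) so expected activations match.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)  # zero each unit w.p. 0.5; survivors rescaled by 1/(1-p)
x = torch.ones(10)

drop.train()              # training mode: units randomly zeroed
print(drop(x))            # a mix of 0.0 and 2.0

drop.eval()               # test mode: dropout is the identity
print(drop(x))            # all 1.0
```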
A Stark Example (Koehn and Knowles 2017)
• Better search (= better model score) can result in a worse BLEU score!
Managing Loss Function / Eval Metric Differences
• Most principled way: use structured prediction techniques, to be discussed in future classes
A Simple Method: Early Stopping w/ Eval Metric
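In code, this just means the stopping criterion tracks the metric you report (maximized) instead of the loss. A sketch where `evaluate_bleu` is a hypothetical stand-in for decoding the dev set and scoring BLEU:

```python
def evaluate_bleu(epoch):
    """Hypothetical stand-in for decoding the dev set and scoring BLEU."""
    return [20.1, 22.4, 23.0, 22.8, 22.9, 22.5][epoch]

best_bleu, best_epoch, bad, patience = -1.0, -1, 0, 2
for epoch in range(6):
    bleu = evaluate_bleu(epoch)  # higher is better, unlike a loss
    if bleu > best_bleu:
        best_bleu, best_epoch, bad = bleu, epoch, 0
        # save_checkpoint(epoch)  # hypothetical: keep the checkpoint that wins on BLEU
    else:
        bad += 1
        if bad >= patience:
            break
print(f"stopping; best BLEU {best_bleu} at epoch {best_epoch}")
```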
Reproducing Previous Work
Taught by Graham Neubig