Debugging Neural Nets for NLP

Overview

Explore debugging techniques for neural networks in natural language processing in this lecture from CMU's Neural Networks for NLP course. Learn to diagnose problems, address training-time and decoding-time issues, combat overfitting, and handle disconnects between loss and evaluation. Gain insights into model sizing, optimization challenges, initialization strategies, and the impact of data sorting on performance. Discover effective approaches for debugging beam search and for implementing dev-driven learning rate decay to improve your NLP models.

Syllabus

Intro
In Neural Networks, Tuning is Paramount!
A Typical Situation
Possible Causes
Identifying Training Time Problems
Is My Model Too Weak?
• Your model needs to be big enough to learn
• Model size depends on task
• For language modeling, at least 512 nodes
• For natural language analysis, 128 or so may do
• Multiple layers are often better
Be Careful of Deep Models
Trouble w/ Optimization
Reminder: Optimizers
Initialization
Bucketing/Sorting
• If we use sentences of different lengths, too much padding can result in slow training
• To remedy this, sort sentences so that sentences of similar length are in the same batch
• But this can affect performance! (Morishita et al. 2017)
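The bucketing idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture's own code: the function names `make_length_bucketed_batches` and `padding_tokens` are hypothetical, and shuffling the order of the finished batches is an assumption added here as one common way to avoid always training on short sentences first.

```python
import random


def make_length_bucketed_batches(sentences, batch_size, shuffle_batches=True, seed=0):
    """Group token lists of similar length into batches to reduce padding.

    Hypothetical helper for illustration; `sentences` is a list of token lists.
    """
    # Sort sentence indices by length so neighbours have similar lengths.
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
    # Cut the sorted order into consecutive chunks of `batch_size`.
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    if shuffle_batches:
        # Assumption: shuffle batch order (not batch contents) between epochs.
        random.Random(seed).shuffle(batches)
    return [[sentences[i] for i in batch] for batch in batches]


def padding_tokens(batches):
    """Count the pad positions needed to make every batch rectangular."""
    total = 0
    for batch in batches:
        longest = max(len(sentence) for sentence in batch)
        total += sum(longest - len(sentence) for sentence in batch)
    return total
```

Comparing `padding_tokens` for batches built in corpus order against the bucketed batches shows how much computation padding wastes. Because sorting changes which sentences are seen together, it is worth checking dev-set results with and without it.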
Debugging Decoding
Beam Search
Debugging Search
Look At Your Data!
Symptoms of Overfitting
Reminder: Dev-driven Learning Rate Decay
• Start with a high learning rate, then decay the learning rate when the model starts overfitting the development set (the newbob learning rate schedule)
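A dev-driven decay rule of this kind can be sketched without any framework. The class below is a hypothetical illustration, not the lecture's implementation: the name `DevDrivenDecay`, the decay factor of 0.5, and the `min_lr` floor are all assumptions, and published newbob variants differ in their exact trigger conditions.

```python
class DevDrivenDecay:
    """Shrink the learning rate whenever dev loss stops improving.

    Hypothetical sketch; decay=0.5 and min_lr are illustrative choices.
    """

    def __init__(self, initial_lr, decay=0.5, min_lr=1e-6):
        self.lr = initial_lr
        self.decay = decay
        self.min_lr = min_lr
        self.best_dev_loss = float("inf")

    def step(self, dev_loss):
        """Call once per epoch with the dev loss; returns the next learning rate."""
        if dev_loss < self.best_dev_loss:
            # Dev loss improved: remember it and keep the current rate.
            self.best_dev_loss = dev_loss
        else:
            # No improvement on dev: treat it as a sign of overfitting and decay.
            self.lr = max(self.lr * self.decay, self.min_lr)
        return self.lr


# Usage: feed per-epoch dev losses and collect the resulting learning rates.
scheduler = DevDrivenDecay(initial_lr=1.0)
rates = [scheduler.step(loss) for loss in [2.0, 1.5, 1.6, 1.4, 1.45]]
```

In the usage lines the rate stays at its initial value while dev loss falls and is halved after each epoch in which dev loss fails to beat the best value so far. PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` offers a similar built-in behaviour.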

Taught by

Graham Neubig
