Overview
Syllabus
Intro
NLP and Sequential Data
Long-distance Dependencies in Language
Can be Complicated!
Recurrent Neural Networks (Elman 1990)
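As a rough illustration (not the lecture's own code), the Elman recurrence h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h) fits in a few lines of NumPy; all sizes below are arbitrary toy values:

```python
import numpy as np

def elman_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One Elman RNN step: h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b_h)."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
d_in, d_h = 4, 8                             # toy sizes, chosen arbitrarily
W_xh = rng.normal(scale=0.1, size=(d_h, d_in))
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))
b_h = np.zeros(d_h)

h = np.zeros(d_h)                            # initial hidden state
for x_t in rng.normal(size=(5, d_in)):       # a length-5 input sequence
    h = elman_step(x_t, h, W_xh, W_hh, b_h)  # same weights reused at every step
print(h.shape)                               # (8,)
```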
Training RNNs
Parameter Tying
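Parameter tying here means the same weights are applied at every time step, so gradients from all steps accumulate into one set of parameters. A minimal PyTorch illustration of that accumulation (the cell below is a stand-in, not the lecture's model):

```python
import torch

torch.manual_seed(0)
cell = torch.nn.Linear(8, 8)            # one shared set of weights, tied across time
h = torch.zeros(8)
xs = torch.randn(5, 8)

for x_t in xs:                          # the SAME cell is applied at every step
    h = torch.tanh(cell(x_t) + h)
h.sum().backward()

# cell.weight.grad now holds the sum of gradient contributions from
# all 5 time steps: that accumulation is what parameter tying implies.
print(cell.weight.grad.shape)           # torch.Size([8, 8])
```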
What Can RNNs Do?
Representing Sentences
e.g. Language Modeling
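A sketch of an RNN language model in PyTorch, assuming toy vocabulary and layer sizes: read words 0..T-1, score each next word, and train with cross-entropy (average next-word negative log-likelihood):

```python
import torch
import torch.nn as nn

vocab, d_emb, d_h = 100, 16, 32              # hypothetical toy sizes

embed = nn.Embedding(vocab, d_emb)
rnn = nn.RNN(d_emb, d_h, batch_first=True)   # Elman-style recurrence
out = nn.Linear(d_h, vocab)                  # scores over the next word

tokens = torch.randint(0, vocab, (1, 10))    # one sequence of 10 word ids
hs, _ = rnn(embed(tokens[:, :-1]))           # read words 0..8
logits = out(hs)                             # predict words 1..9
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
print(loss.item())                           # average next-word NLL
```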
Vanishing Gradient: Gradients Decrease as They Get Pushed Back
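A small numeric illustration of the effect (sizes and scales are arbitrary): backprop through h_t = tanh(W_hh h_{t-1} + ...) multiplies the gradient by W_hh^T and by tanh' <= 1 at every step, so its norm shrinks geometrically:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W_hh = rng.normal(scale=0.2, size=(d, d))   # modest recurrent weights (assumed)

g = np.ones(d)                              # gradient arriving at the last step
for t in range(1, 51):
    # One step of backprop through h_t = tanh(W_hh @ h_prev + ...):
    # multiply by a typical tanh' value (0.5 here) and by W_hh^T.
    g = W_hh.T @ (0.5 * g)
    if t % 10 == 0:
        print(t, np.linalg.norm(g))         # norm shrinks geometrically
```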
A Solution: Long Short-term Memory (Hochreiter and Schmidhuber 1997)
LSTM Structure
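One way to write the standard LSTM cell update in NumPy (the gate ordering and toy sizes are assumptions of this sketch): the additive cell update c_t = f * c_{t-1} + i * tanh(g) is the path that lets gradients survive many steps:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4h, d_in), U: (4h, h), b: (4h,), gate order i,f,o,g."""
    z = W @ x_t + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c_prev + i * np.tanh(g)   # additive cell update: the "memory" path
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
d_in, d_h = 4, 8
W = rng.normal(scale=0.1, size=(4 * d_h, d_in))
U = rng.normal(scale=0.1, size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```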
What can LSTMs Learn? (1)
Handling Mini-batching
Mini-batching Method
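A minimal sketch of the usual padding-plus-masking recipe, with hypothetical token-id sequences: pad every sequence to the batch maximum and keep a 0/1 mask so padding tokens contribute nothing to the loss:

```python
import numpy as np

seqs = [[4, 9, 2], [7, 1], [3, 8, 5, 6]]    # made-up sequences of unequal length
PAD = 0

max_len = max(len(s) for s in seqs)
batch = np.full((len(seqs), max_len), PAD, dtype=np.int64)
mask = np.zeros((len(seqs), max_len), dtype=np.float32)
for i, s in enumerate(seqs):
    batch[i, :len(s)] = s
    mask[i, :len(s)] = 1.0                  # 1 on real tokens, 0 on padding

# Per-token losses get multiplied by `mask`, so padding positions
# contribute nothing to the batch loss.
print(batch)
print(mask)
```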
Bucketing/Sorting
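Bucketing/sorting reduces that padding waste: sort sentences by length, cut consecutive runs into mini-batches, then shuffle the batches. A sketch on a made-up corpus:

```python
import random

# Hypothetical corpus: sentences as token-id lists of varying length.
corpus = [[0] * n for n in [3, 17, 4, 16, 3, 18, 5, 15]]

# Sort by length, slice consecutive sentences into mini-batches,
# so each batch mixes similar lengths and padding is minimized.
corpus.sort(key=len)
batch_size = 2
batches = [corpus[i:i + batch_size] for i in range(0, len(corpus), batch_size)]
random.shuffle(batches)                # shuffle batch order, not sentence order
for b in batches:
    print([len(s) for s in b])         # lengths within a batch stay close
```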
Optimized Implementations of LSTMs (Appleyard 2015)
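In practice, fused kernels of the kind Appleyard describes are reached by calling a monolithic LSTM implementation rather than looping over cells in Python; for instance, PyTorch's nn.LSTM dispatches to cuDNN on GPU. A rough comparison, with toy sizes:

```python
import torch
import torch.nn as nn

x = torch.randn(8, 20, 32)                # (batch, time, features), toy sizes

# Per-step Python loop over an LSTMCell: flexible but slow.
cell = nn.LSTMCell(32, 64)
h = c = torch.zeros(8, 64)
for t in range(x.size(1)):
    h, c = cell(x[:, t], (h, c))

# Monolithic nn.LSTM runs the whole sequence in one call; on GPU it
# uses cuDNN's fused kernels, the kind of optimization Appleyard describes.
lstm = nn.LSTM(32, 64, batch_first=True)
out, (h_n, c_n) = lstm(x)
print(out.shape)                          # torch.Size([8, 20, 64])
```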
Gated Recurrent Units (Cho et al. 2014)
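The GRU merges the LSTM's gates into an update gate z and a reset gate r. A NumPy sketch following Cho et al.'s convention h_t = z * h_{t-1} + (1 - z) * h~_t (the parameter layout is this sketch's assumption):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU step. W: (3h, d_in), U: (3h, h), b: (3h,), gate order z,r,g."""
    d_h = h_prev.shape[0]
    z = sigmoid(W[:d_h] @ x_t + U[:d_h] @ h_prev + b[:d_h])                  # update gate
    r = sigmoid(W[d_h:2*d_h] @ x_t + U[d_h:2*d_h] @ h_prev + b[d_h:2*d_h])  # reset gate
    g = np.tanh(W[2*d_h:] @ x_t + U[2*d_h:] @ (r * h_prev) + b[2*d_h:])     # candidate
    return z * h_prev + (1 - z) * g       # interpolate old state and candidate

rng = np.random.default_rng(0)
d_in, d_h = 4, 8
W = rng.normal(scale=0.1, size=(3 * d_h, d_in))
U = rng.normal(scale=0.1, size=(3 * d_h, d_h))
b = np.zeros(3 * d_h)
h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):
    h = gru_step(x_t, h, W, U, b)
```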
Soft Hierarchical Structure
Handling Long Sequences
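One standard option for long sequences is truncated backpropagation through time: carry the hidden state across fixed-length segments but detach it at segment boundaries, so no backward pass spans the whole sequence. A PyTorch sketch with an assumed segment length and a stand-in loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(16, 32, batch_first=True)
head = nn.Linear(32, 16)
opt = torch.optim.SGD(list(lstm.parameters()) + list(head.parameters()), lr=0.1)

long_seq = torch.randn(1, 1000, 16)            # toy "long" sequence
state = None
for start in range(0, 1000, 100):              # segments of length 100 (assumed)
    seg = long_seq[:, start:start + 100]
    out, state = lstm(seg, state)
    loss = head(out).pow(2).mean()             # stand-in loss for illustration
    opt.zero_grad()
    loss.backward()
    opt.step()
    state = tuple(s.detach() for s in state)   # truncate the gradient here
```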
Taught by
Graham Neubig