Overview
Explore recurrent neural networks (RNNs) for natural language processing in this lecture from Carnegie Mellon University's Neural Networks for NLP course. Cover sequential data processing, long-distance dependencies in language, and unrolling RNNs in time. Learn about training, parameter tying, and how RNNs are used to represent sentences and contexts, for example in language modeling. Examine the structure of Long Short-Term Memory (LSTM) networks and other alternatives. Look at practical implementation issues such as mini-batching and handling long sequences, along with the strengths and weaknesses of RNNs. Conclude with an overview of pre-training and transfer learning for RNNs in NLP tasks.
Syllabus
Intro
NLP and Sequential Data
Long-distance Dependencies in Language
Can be Complicated!
Unrolling in Time
Training RNNs
Parameter Tying
What Can RNNs Do?
Representing Sentences
Representing Contexts
e.g. Language Modeling
RNNLM Example: Loss Calculation and State Update
LSTM Structure
Other Alternatives
Handling Mini-batching
Mini-batching Method
Bucketing/Sorting
Handling Long Sequences
RNN Strengths/Weaknesses
Pre-training/Transfer
Taught by
Graham Neubig