Overview
Syllabus
– Good morning
– How to summarise papers as @y0b1byte with Notion
– Why do we need to go to a higher hidden dimension?
– Today's class: recurrent neural nets
– Vector to sequence (vec2seq)
– Sequence to vector (seq2vec)
– Sequence to vector to sequence (seq2vec2seq)
– Sequence to sequence (seq2seq)
– Training a recurrent network: backpropagation through time
– Training example: language model
– Vanishing & exploding gradients and gating mechanism
– The Long Short-Term Memory (LSTM)
– Jupyter Notebook and PyTorch in action: sequence classification
– Inspecting the activation values
– Closing remarks
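As a taste of the "PyTorch in action: sequence classification" item above, here is a minimal sketch of an LSTM-based sequence classifier; the class name, sizes, and data are illustrative assumptions, not the notebook used in the lecture.

```python
import torch
import torch.nn as nn

class SequenceClassifier(nn.Module):
    """Illustrative LSTM classifier: reads a sequence, predicts one class label."""
    def __init__(self, input_size: int, hidden_size: int, num_classes: int):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, seq_len, input_size)
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h_n holds the final hidden state for each sequence: (1, batch, hidden_size)
        _, (h_n, _) = self.lstm(x)
        # Map the last hidden state to class logits: (batch, num_classes)
        return self.fc(h_n[-1])

model = SequenceClassifier(input_size=8, hidden_size=16, num_classes=3)
x = torch.randn(4, 10, 8)   # a batch of 4 sequences, each of length 10
logits = model(x)
print(tuple(logits.shape))  # (4, 3): one score per class, per sequence
```

Inspecting `h_n` (the final hidden activations) is also the natural entry point for the "Inspecting the activation values" segment of the class.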
Taught by
Alfredo Canziani