Decoder Flow in Transformer Model

Overview

This 40-minute video tutorial walks through coding a Transformer decoder from scratch. It covers parameter setup, the model's inputs and outputs, masking, and the decoder's forward pass, then works through each sublayer of a decoder layer: masked multi-head self-attention, dropout and layer normalization, multi-head cross-attention, and the feed-forward network. The tutorial instantiates the decoder, implements its layers, and finishes by assembling the complete decoder flow. It suits viewers who want a deeper understanding of how this architecture works in natural language processing and machine learning.

Syllabus

Introduction
Parameters of Transformer
Inputs and Outputs of Transformer
Masking
Instantiating Decoder
Decoder Forward Pass
Decoder Layer
Masked Multi-Head Self-Attention
Dropout + Layer Normalization
Multi-Head Cross-Attention
Feed Forward, Activation
Completing the Decoder Flow
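
For orientation, the order of operations named in the syllabus can be sketched in a few lines of plain Python. This is not the tutorial's code: it is a minimal illustration that assumes a single attention head, no learned query/key/value projections, no dropout, and the post-norm ordering of the original Transformer (sublayer, residual add, layer normalization). All function names and the toy inputs (`y`, `enc_out`, `w1`, `w2`) are hypothetical.

```python
import math

def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

def causal_mask(n):
    # 0.0 where attention is allowed, -inf where key position j lies after query position i.
    return [[0.0 if j <= i else float("-inf") for j in range(n)] for i in range(n)]

def attention(q, k, v, mask=None):
    # Scaled dot-product attention with an optional additive mask.
    d_k = len(q[0])
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]
    if mask is not None:
        scores = [[s + m for s, m in zip(srow, mrow)] for srow, mrow in zip(scores, mask)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

def layer_norm(x, eps=1e-5):
    # Normalize each position's vector to zero mean and unit variance (no learned scale or shift).
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def feed_forward(x, w1, w2):
    # Two linear maps with a ReLU activation in between (biases omitted).
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)

def decoder_layer(y, enc_out, w1, w2):
    # 1. Masked self-attention over the target sequence, then add and normalize.
    self_out, self_weights = attention(y, y, y, mask=causal_mask(len(y)))
    y = layer_norm(add(y, self_out))
    # 2. Cross-attention: queries from the decoder, keys and values from the encoder output.
    cross_out, _ = attention(y, enc_out, enc_out)
    y = layer_norm(add(y, cross_out))
    # 3. Position-wise feed-forward network, then add and normalize.
    y = layer_norm(add(y, feed_forward(y, w1, w2)))
    return y, self_weights

# Toy inputs: 3 target positions, 2 encoder positions, model width 4, hidden width 8.
y = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.1, 0.9]]
enc_out = [[0.2, 0.1, 0.4, 0.3], [0.6, 0.5, 0.2, 0.1]]
w1 = [[0.1 * ((i + j) % 3 - 1) for j in range(8)] for i in range(4)]
w2 = [[0.1 * ((i * j) % 3 - 1) for j in range(4)] for i in range(8)]
out, self_weights = decoder_layer(y, enc_out, w1, w2)
```

In `self_weights`, every entry above the diagonal is zero, which is the effect of the look-ahead mask: a target position cannot attend to positions that come after it. A full decoder stacks several such layers and adds the pieces left out here, including multiple heads, learned projections, and dropout.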