Decoder-Only Transformers, ChatGPT's Specific Transformer, Clearly Explained
StatQuest with Josh Starmer via YouTube
Overview
Transformers are taking over AI right now, and quite possibly their most famous use is in ChatGPT. ChatGPT uses a specific type of Transformer called a Decoder-Only Transformer, and this StatQuest shows you how it works, one step at a time. At the end, we talk about the differences between a Normal Transformer and a Decoder-Only Transformer. BAM!
Syllabus
Awesome song and introduction
Word Embedding
Position Encoding
Masked Self-Attention, an Autoregressive method
Residual Connections
Generating the next word in the prompt
Review of encoding and generating the prompt
Generating the output, Part 1
Masked Self-Attention while generating the output
Generating the output, Part 2
Normal Transformers vs Decoder-Only Transformers
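To give a feel for the Masked Self-Attention step in the syllabus, here is a minimal sketch in plain Python. It is not the code from the video. The function name `masked_self_attention` and the toy numbers are made up for illustration. It also assumes the query, key, and value vectors are handed in directly. In a real Decoder-Only Transformer they would be computed from the word embeddings plus position encodings using learned weights.

```python
import math

def masked_self_attention(queries, keys, values):
    """Causal scaled dot-product attention over lists of equal-length vectors.

    Hypothetical helper for illustration: token i only looks at tokens 0..i.
    Returns (outputs, weights), one row per token.
    """
    d_k = len(keys[0])
    n = len(queries)
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Masking: token i may only attend to itself and earlier tokens.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the weighted sum of the visible value vectors.
        out = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
        # Pad with zeros so every row shows the masked (future) positions.
        all_weights.append(weights + [0.0] * (n - i - 1))
    return outputs, all_weights

# Toy input: three tokens with two-dimensional vectors (made-up numbers).
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = masked_self_attention(q, k, v)
for row in weights:
    print([round(w, 3) for w in row])
```

The printed weights form a lower triangle: each token puts zero weight on the tokens that come after it. That is what makes the method autoregressive, and it is why the first token's output is just its own value vector.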
Taught by
StatQuest with Josh Starmer