Decoder-Only Transformers, ChatGPT's Specific Transformer, Clearly Explained

StatQuest with Josh Starmer via YouTube

Masked Self-Attention, an Autoregressive method

5 of 12
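
In this chapter, the key idea is that masked self-attention only lets each token attend to itself and the tokens that came before it, which is what makes the Decoder-Only Transformer autoregressive: it generates one token at a time, each conditioned on everything generated so far. As a rough companion to the video's diagrams, here is a minimal NumPy sketch of scaled dot-product self-attention with a causal mask. It is not code from the video, and the matrix names and sizes are purely illustrative.

```python
import numpy as np

def masked_self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention with a causal (look-back-only) mask.

    X: (seq_len, d_model) token encodings (word embeddings + position encodings).
    W_q, W_k, W_v: (d_model, d_k) learned projection matrices (illustrative).
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v        # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # similarity of every pair of tokens

    # Causal mask: position i may only attend to positions <= i,
    # which is what lets the model generate text one token at a time.
    seq_len = X.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Softmax over each row turns the (masked) scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                          # weighted sum of value vectors

# Tiny example: 3 tokens, 4-dimensional encodings, 2-dimensional queries/keys/values.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
W_q, W_k, W_v = (rng.normal(size=(4, 2)) for _ in range(3))
print(masked_self_attention(X, W_q, W_k, W_v))
```

Each row of the returned matrix is a weighted combination of the value vectors for the current and earlier tokens only; the masked (future) positions receive zero weight after the softmax.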

Classroom Contents

  1. Transformers are taking over AI right now, and quite possibly their most famous use is in ChatGPT. ChatGPT uses a specific type of Transformer called a Decoder-Only Transformer, and this StatQuest sh…
  2. Awesome song and introduction
  3. Word Embedding
  4. Position Encoding
  5. Masked Self-Attention, an Autoregressive method
  6. Residual Connections
  7. Generating the next word in the prompt
  8. Review of encoding and generating the prompt
  9. Generating the output, Part 1
  10. Masked Self-Attention while generating the output
  11. Generating the output, Part 2
  12. Normal Transformers vs Decoder-Only Transformers
