Self / Cross, Hard / Soft Attention and the Transformer

Alfredo Canziani via YouTube

Classroom Contents

  1. 1 – Welcome to class
  2. 2 – Listening to YouTube from the terminal
  3. 3 – Summarising papers with @Notion
  4. 4 – Reading papers collaboratively
  5. 5 – Attention! Self / cross, hard / soft
  6. 6 – Use cases: set encoding!
  7. 7 – Self-attention
  8. 8 – Key-value store
  9. 9 – Queries, keys, and values → self-attention
  10. 10 – Queries, keys, and values → cross-attention
  11. 11 – Implementation details
  12. 12 – The Transformer: an encoder-predictor-decoder architecture
  13. 13 – The Transformer encoder
  14. 14 – The Transformer “decoder” which is an encoder-predictor-decoder module
  15. 15 – Jupyter Notebook and PyTorch implementation of a Transformer encoder (a companion sketch follows this list)
  16. 16 – Goodbye :)
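
Chapter 15 closes with a Jupyter Notebook and PyTorch implementation of a Transformer encoder. As a rough companion, here is a minimal sketch of the soft attention mechanism built up in chapters 8–10; the class name, shapes, and toy usage below are illustrative assumptions, not the notebook's actual code.

```python
# A minimal sketch of single-head soft attention in PyTorch, following the
# queries/keys/values story of chapters 8-10. All names, shapes, and the toy
# usage at the bottom are illustrative assumptions, not the lecture's code.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class Attention(nn.Module):
    """Soft attention: a softmax-weighted average of the values."""

    def __init__(self, d_model: int) -> None:
        super().__init__()
        self.d_model = d_model
        # Learned projections producing queries, keys, and values.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor,
                context: Optional[torch.Tensor] = None) -> torch.Tensor:
        # Self-attention (chapter 9): queries, keys, and values all come
        # from x. Cross-attention (chapter 10): keys and values come from a
        # second sequence, passed here as `context`.
        ctx = x if context is None else context
        q = self.q_proj(x)    # (batch, len_q,  d_model)
        k = self.k_proj(ctx)  # (batch, len_kv, d_model)
        v = self.v_proj(ctx)  # (batch, len_kv, d_model)
        # Scaled dot-product scores, then a softmax over the key positions.
        scores = q @ k.transpose(-2, -1) / self.d_model ** 0.5
        weights = F.softmax(scores, dim=-1)  # (batch, len_q, len_kv)
        return weights @ v                   # (batch, len_q, d_model)


# Toy usage: a batch of 2 sequences, 5 tokens each, model dimension 16.
attn = Attention(d_model=16)
x = torch.randn(2, 5, 16)
print(attn(x).shape)       # self-attention  -> torch.Size([2, 5, 16])
ctx = torch.randn(2, 7, 16)
print(attn(x, ctx).shape)  # cross-attention -> torch.Size([2, 5, 16])
```

The Transformer encoder of chapter 13 then wraps this attention in the standard block: a residual connection and layer normalisation around the attention, followed by a position-wise feed-forward network with its own residual and normalisation.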
