Neural Nets for NLP 2017 - Attention

Graham Neubig via YouTube

Now playing: Multi-headed Attention (20 of 22)


Classroom Contents

  1. Intro
  2. Sentence Representations
  3. Basic Idea (Bahdanau et al. 2015)
  4. Calculating Attention (1)
  5. A Graphical Example
  6. Attention Score Functions (2) (see the score-function sketch after this list)
  7. Input Sentence
  8. Previously Generated Things
  9. Various Modalities
  10. Hierarchical Structures (Yang et al. 2016)
  11. Multiple Sources
  12. Intra-Attention / Self-Attention (Cheng et al. 2016): each element in the sentence attends to the other elements, yielding context-sensitive encodings
  13. Coverage
  14. Incorporating Markov Properties (Cohn et al. 2015)
  15. Bidirectional Training (Cohn et al. 2015)
  16. Supervised Training (Mi et al. 2016)
  17. Attention is not Alignment! (Koehn and Knowles 2017): attention is often blurred
  18. Monotonic Attention (e.g. Yu et al. 2016)
  19. Convolutional Attention (Allamanis et al. 2016)
  20. Multi-headed Attention (see the multi-head sketch after this list)
  21. Summary of the "Transformer" (Vaswani et al. 2017)
  22. Attention Tricks
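
As a companion to items 4 and 6 above ("Calculating Attention" and "Attention Score Functions"), here is a minimal NumPy sketch of the standard score functions (dot product, bilinear, and the multi-layer perceptron of Bahdanau et al. 2015) together with the softmax-weighted context vector. This is not code from the lecture: the function names, parameter shapes, and the `attend` helper are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a vector of scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dot_score(q, K):
    # Dot product: score(q, k_j) = q . k_j
    return K @ q

def bilinear_score(q, K, W):
    # Bilinear: score(q, k_j) = k_j^T W q
    return K @ (W @ q)

def mlp_score(q, K, W1, W2, w):
    # Multi-layer perceptron (Bahdanau et al. 2015):
    # score(q, k_j) = w^T tanh(W1 k_j + W2 q)
    return np.tanh(K @ W1.T + W2 @ q) @ w

def attend(q, K, V, score_fn, *params):
    # Turn scores into attention weights, then return the
    # weighted average of the value vectors (the context).
    a = softmax(score_fn(q, K, *params))
    return a @ V, a

# Tiny demo: a query (e.g. a decoder state) attends over five
# encoder states; here the values are the keys themselves.
rng = np.random.default_rng(0)
q = rng.normal(size=8)
K = rng.normal(size=(5, 8))
ctx, weights = attend(q, K, K, dot_score)
```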
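
Likewise for item 20, a compact sketch of multi-headed scaled dot-product self-attention in the style of the Vaswani et al. (2017) Transformer. Again an illustrative sketch rather than the lecture's code: the projection matrices `Wq`, `Wk`, `Wv`, `Wo` and the head-splitting layout are assumptions of this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    # X: (n, d) sequence of vectors; each W*: (d, d) projection.
    n, d = X.shape
    dh = d // n_heads                       # per-head dimension
    Q, K, V = X @ Wq, X @ Wk, X @ Wv        # (n, d) each
    # Split the model dimension into heads -> (n_heads, n, dh).
    split = lambda M: M.reshape(n, n_heads, dh).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    # Each head runs scaled dot-product attention over the
    # whole sequence independently.
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(dh)   # (n_heads, n, n)
    ctx = softmax(scores) @ Vh                          # (n_heads, n, dh)
    # Concatenate the heads and apply the output projection.
    return ctx.transpose(1, 0, 2).reshape(n, d) @ Wo

# Tiny demo: 6 tokens, model dimension 16, 4 heads.
rng = np.random.default_rng(0)
n, d = 6, 16
X = rng.normal(size=(n, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(4))
Y = multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads=4)  # (6, 16)
```

Splitting the representation into several lower-dimensional heads lets each head attend to different positions or relations in the sequence at once, which is the usual motivation for multi-headed attention.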
