Neural Nets for NLP - Structured Prediction with Local Independence Assumptions

Graham Neubig via YouTube

Classroom Contents

  1. Intro
  2. Sequence Labeling • One tag for each word, e.g. part-of-speech tagging
  3. Sequence Labeling as Independent Classification
  4. Problems
  5. Exposure Bias / Teacher Forcing
  6. Label Bias
  7. Models w/ Local Dependencies
  8. Reminder: Globally Normalized Models
  9. Conditional Random Fields • General form of globally normalized model
  10. Potential Functions
  11. BiLSTM-CRF for Sequence Labeling
  12. CRF Training & Decoding
  13. Interactions
  14. Forward Step: Initial Part • First, calculate the transition from the start symbol and the emission of the first word for every POS
  15. Forward Step: Middle Parts
  16. Forward Step: Final Part • Finish up the sentence with the sentence-final symbol
  17. Computing the Partition Function • α(y,t)|X is the partition of sequences with length equal to t that end with label y
  18. Decoding and Gradient Calculation
  19. CNN for Character-level Representation • A CNN is used to extract morphological information such as the prefix or suffix of a word
  20. Training Details
  21. Experiments
  22. Reward Functions in Structured Prediction
  23. Previous Methods to Consider Reward
  24. Minimizing Risk by Enumeration • Simple idea: directly calculate the risk of all hypotheses in the space
  25. Enumeration + Sampling (Shen+ 2016) • Enumerating all hypotheses is intractable! Instead of enumerating over everything, only enumerate over a sample, and re-normalize
  26. Token-wise Minimum Risk • If we can come up with a decomposable error function, we can calculate the risk for each word
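The forward steps in items 14–17 (initial part, middle parts, final part, partition function) can be sketched for a linear-chain CRF as below. This is a minimal numpy illustration, not the lecture's actual code; the function name and the `init`/`final` score vectors are assumptions standing in for the start and sentence-final symbols.

```python
import numpy as np

def log_partition(emit, trans, init, final):
    """Forward algorithm for a linear-chain CRF (illustrative sketch).

    emit:  (T, K) log emission potentials, one row per word
    trans: (K, K) log transition potentials, trans[i, j] = score(i -> j)
    init:  (K,)   log scores for transitioning from the start symbol
    final: (K,)   log scores for ending with the sentence-final symbol
    """
    T, K = emit.shape
    # Initial part: transition from the start symbol plus emission of word 0
    alpha = init + emit[0]
    # Middle parts: log-sum-exp over the previous tag at each step
    for t in range(1, T):
        alpha = np.logaddexp.reduce(alpha[:, None] + trans, axis=0) + emit[t]
    # Final part: finish with the sentence-final symbol and sum out the last tag
    return np.logaddexp.reduce(alpha + final)
```

Because log-sum-exp distributes over the chain-structured score, this runs in O(T·K²) instead of enumerating all K^T tag sequences.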
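For the decoding side of item 18, Viterbi uses the same recurrence as the forward algorithm with max in place of log-sum-exp, plus backpointers. A hedged numpy sketch under the same assumed `init`/`final` convention as above:

```python
import numpy as np

def viterbi(emit, trans, init, final):
    """Viterbi decoding for a linear-chain CRF: the forward recurrence
    with max instead of log-sum-exp (illustrative sketch)."""
    T, K = emit.shape
    score = init + emit[0]                  # best score ending in each tag
    back = np.zeros((T, K), dtype=int)      # backpointers to the previous tag
    for t in range(1, T):
        cand = score[:, None] + trans       # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + emit[t]
    best = int((score + final).argmax())    # best final tag
    tags = [best]
    for t in range(T - 1, 0, -1):           # follow backpointers
        tags.append(int(back[t, tags[-1]]))
    return tags[::-1]
```

The returned list is the highest-scoring tag sequence; swapping `max`/`argmax` back to log-sum-exp recovers the partition computation.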
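Item 26's idea — with a decomposable error function, risk can be computed per word — can be illustrated with a 0/1 per-token error, where the expected error at each position is one minus the probability of the gold tag. This is only a sketch of the idea under an independent-classification model, not the lecture's formulation:

```python
import numpy as np

def tokenwise_risk(logits, gold):
    """Token-wise minimum-risk objective sketch with 0/1 per-token error.

    logits: (T, K) unnormalized tag scores for each word
    gold:   (T,)   gold tag ids
    Returns the summed expected error, sum_t (1 - p(gold_t | word_t)).
    """
    # Numerically stable softmax over the tag dimension
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    return float(np.sum(1.0 - probs[np.arange(len(gold)), gold]))
```

Unlike sequence-level risk (items 24–25), this needs no enumeration or sampling over hypotheses, because the error decomposes across positions.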
