Classroom Contents
Neural Nets for NLP - Efficiency Tricks for Neural Nets
- 1 Intro
- 2 Why are Neural Networks Slow and What Can we Do?
- 3 A Simple Example • How long does a matrix-matrix multiply take?
- 4 Practically
- 5 Speed Trick 3
- 6 Reduce # of Operations
- 7 Reduce CPU-GPU Data Movement
- 8 What About Memory?
- 9 Three Types of Parallelism
- 10 Within-operation Parallelism
- 11 Operation-wise Parallelism
- 12 Example-wise Parallelism
- 13 Computation Across Large Vocabularies
- 14 A Visual Example of the Softmax
- 15 Importance Sampling (Bengio and Senecal 2003)
- 16 Noise Contrastive Estimation (Mnih & Teh 2012)
- 17 Mini-batch Based Negative Sampling
- 18 Hard Negative Mining • Select the top n hardest examples
- 19 Efficient Maximum Inner Product Search
- 20 Structure-based Approximations
- 21 Class-based Softmax (Goodman 2001) • Assign each word to a class
- 22 Binary Code Prediction (Dietterich and Bakiri 1995, Oda et al. 2017)
- 23 Two Improvements to Binary Code Prediction