Structured Quantization for Neural Network Language Model Compression

tinyML via YouTube

Classroom Contents

  1. Introduction
  2. Neural network vs NLP
  3. Language model
  4. Memory
  5. Neural Network
  6. Word Embedding
  7. Neural Network Size
  8. General Approach
  9. Pruning
  10. Quantization based approaches
  11. Fixed point quantization
  12. Product quantization
  13. Speed recognition performance
  14. Binarization
  15. Embedding Matrix
  16. Full Precision Model
  17. Two Methods
  18. Results
  19. Conclusion
  20. Question
  21. Sponsors
