8-bit Methods for Efficient Deep Learning

Center for Language & Speech Processing (CLSP), JHU via YouTube

Classroom Contents

  1. Intro
  2. How does quantization work?
  3. Quantization as a mapping
  4. Quantization Example: A non-standard 2-bit data type
  5. Floating point data types (FP8)
  6. Dynamic exponent quantization
  7. Motivation: Optimizers take up a lot of memory!
  8. What do outliers in quantization look like?
  9. Block-wise quantization
  10. Putting it together: 8-bit optimizers
  11. Using OPT-175B on a single machine via 8-bit weights
  12. The problem with quantizing outliers with large values
  13. Emergent features: sudden vs. smooth emergence
  14. Mixed precision decomposition
  15. Bit-level scaling laws experimental setup overview
  16. What does help to improve scaling? Data types
  17. Nested Quantization
  18. Instruction Tuning with 4-bit + Adapters
  19. 4-bit Normal Float (NF4)
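
The chapters above center on quantization: mapping high-precision floating-point values to a small set of low-bit codes and back. As an illustration only (not code from the talk), the sketch below shows the block-wise absmax idea touched on in chapters 2, 9, and 10 in plain NumPy: each block of values gets its own scale, so a single outlier only degrades the precision of its own block. The function names and the block size of 64 are illustrative assumptions.

```python
# Minimal sketch of block-wise absmax 8-bit quantization (illustrative only).
import numpy as np

def quantize_blockwise(x: np.ndarray, block_size: int = 64):
    """Quantize a 1-D float array to int8 with one absmax scale per block."""
    n = x.size
    pad = (-n) % block_size
    x_padded = np.concatenate([x, np.zeros(pad, dtype=x.dtype)])
    blocks = x_padded.reshape(-1, block_size)
    # One scale per block: the block's largest absolute value, so an outlier
    # in one block does not stretch the quantization range of other blocks.
    scales = np.abs(blocks).max(axis=1, keepdims=True)
    scales = np.where(scales == 0, 1.0, scales)
    q = np.clip(np.round(blocks / scales * 127), -127, 127).astype(np.int8)
    return q, scales.squeeze(1), n

def dequantize_blockwise(q: np.ndarray, scales: np.ndarray, n: int):
    """Invert the mapping: int8 codes back to approximate float values."""
    x = q.astype(np.float32) / 127 * scales[:, None]
    return x.reshape(-1)[:n]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=1000).astype(np.float32)
    q, s, n = quantize_blockwise(w)
    w_hat = dequantize_blockwise(q, s, n)
    print("max abs error:", np.abs(w - w_hat).max())
```

Running the example prints a small reconstruction error. In the talk, the same block-wise idea is applied to optimizer states (chapter 10) and, combined with mixed precision decomposition of outlier dimensions (chapter 14), to the weights of large models such as OPT-175B (chapter 11).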
