Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
8-bit Methods for Efficient Deep Learning
- 1 Intro
- 2 How does quantization work?
- 3 Quantization as a mapping
- 4 Quantization Example: A non-standard 2-bit data type
- 5 Floating point data types (FP8)
- 6 Dynamic exponent quantization
- 7 Motivation: Optimizers take up a lot of memory!
- 8 What do outliers in quantization look like?
- 9 Block-wise quantization
- 10 Putting it together: 8-bit optimizers
- 11 Using OPT-175B on a single machine via 8-bit weights
- 12 The problem with quantizing outliers with large values
- 13 Emergent features: sudden vs. smooth emergence
- 14 Mixed precision decomposition
- 15 Bit-level scaling laws experimental setup overview
- 16 What does help to improve scaling? Data types
- 17 Nested Quantization
- 18 Instruction Tuning with 4-bit + Adapters
- 19 4-bit Normal Float (NF4)
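The core idea behind the early chapters ("How does quantization work?" and "Quantization as a mapping") can be illustrated with a minimal absmax quantization sketch. This is an assumption-laden toy example, not code from the talk: it maps floats into the signed 8-bit integer range by scaling with the tensor's largest absolute value, which is also the setting where the outlier problems discussed in later chapters arise.

```python
def absmax_quantize(xs):
    # Absmax quantization: scale values so the largest magnitude
    # maps to 127, then round each value to a signed 8-bit integer.
    scale = 127.0 / max(abs(v) for v in xs)
    return [round(v * scale) for v in xs], scale

def absmax_dequantize(qs, scale):
    # Invert the mapping; exact only up to rounding error.
    return [q / scale for q in qs]

x = [0.1, -0.5, 1.2, -2.0]
q, s = absmax_quantize(x)
x_hat = absmax_dequantize(q, s)
```

A single large outlier inflates the scaling constant and crushes all other values toward zero, which is why the talk covers block-wise quantization (a separate scale per block) and mixed-precision decomposition (handling outlier dimensions in 16-bit).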