8-bit Methods for Efficient Deep Learning
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Syllabus
Intro
How does quantization work?
Quantization as a mapping
Quantization Example: A non-standard 2-bit data type
Floating point data types (FP8)
Dynamic exponent quantization
Motivation: Optimizers take up a lot of memory!
What do outliers in quantization look like?
Block-wise quantization
Putting it together: 8-bit optimizers
Using OPT-175B on a single machine via 8-bit weights
The problem with quantizing outliers with large values
Emergent features: sudden vs. smooth emergence
Mixed precision decomposition
Bit-level scaling laws experimental setup overview
What does help to improve scaling? Data types
Nested Quantization
Instruction Tuning with 4-bit + Adapters
4-bit Normal Float (NF4)
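
The core mechanism behind the "Quantization as a mapping" and "Block-wise quantization" segments is absmax quantization applied independently to small blocks, so that an outlier only hurts precision within its own block. A minimal NumPy sketch of the idea (block size and function names are illustrative, not the bitsandbytes API):

import numpy as np

def blockwise_absmax_quantize(x, block_size=64):
    # Quantize a flat float array to int8 in independent blocks.
    # Each block gets its own scale, so a single outlier only degrades
    # precision inside its block rather than across the whole tensor.
    blocks = x.reshape(-1, block_size)                  # assumes len(x) is a multiple of block_size
    absmax = np.abs(blocks).max(axis=1, keepdims=True)  # per-block scale
    codes = np.round(blocks / absmax * 127).astype(np.int8)
    return codes, absmax

def blockwise_dequantize(codes, absmax):
    # Map int8 codes back to approximate float values.
    return (codes.astype(np.float32) / 127) * absmax

x = np.random.randn(1024).astype(np.float32)
codes, scales = blockwise_absmax_quantize(x)
x_hat = blockwise_dequantize(codes, scales).reshape(-1)
print("max rounding error:", np.abs(x - x_hat).max())

The same block-wise scheme is what makes 8-bit optimizers practical: Adam's first and second moment states are stored in this compressed form and dequantized block by block during each update.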
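
The "Emergent features" and "Mixed precision decomposition" segments concern LLM.int8(): a small set of hidden-state dimensions carries large-magnitude outlier features, so those columns are multiplied in 16-bit while the rest of the matrix multiply runs in int8. A hedged sketch of the decomposition (the threshold and the simulated int8 path are illustrative):

import numpy as np

def mixed_precision_matmul(X, W, threshold=6.0):
    # Split X @ W into a high-precision path for outlier columns of X
    # and a simulated int8 path with row/column absmax scaling for the rest.
    outlier_cols = np.abs(X).max(axis=0) > threshold
    regular_cols = ~outlier_cols

    out_fp = X[:, outlier_cols] @ W[outlier_cols, :]     # outlier dims kept in float

    Xr, Wr = X[:, regular_cols], W[regular_cols, :]
    sx = np.abs(Xr).max(axis=1, keepdims=True) + 1e-8    # row-wise scales for X
    sw = np.abs(Wr).max(axis=0, keepdims=True) + 1e-8    # column-wise scales for W
    Xq = np.round(Xr / sx * 127).astype(np.int8)
    Wq = np.round(Wr / sw * 127).astype(np.int8)
    out_int8 = (Xq.astype(np.int32) @ Wq.astype(np.int32)) * sx * sw / (127 * 127)

    return out_fp + out_int8

X = np.random.randn(4, 16); X[:, 3] *= 20                # inject one outlier column
W = np.random.randn(16, 8)
print("error vs. full precision:", np.abs(mixed_precision_matmul(X, W) - X @ W).max())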
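
The "4-bit Normal Float (NF4)" segment introduces a 4-bit data type whose 16 levels are spaced so that each is roughly equally likely under a zero-mean normal distribution, matching how pretrained weights tend to be distributed. A toy version that quantizes against a quantile-based codebook (this codebook is built for illustration and is not the exact NF4 table):

import numpy as np
from scipy.stats import norm

# 16 levels at evenly spaced quantiles of a standard normal, rescaled to [-1, 1].
quantiles = norm.ppf(np.linspace(0.02, 0.98, 16))
codebook = quantiles / np.abs(quantiles).max()

def nf4_like_quantize(w):
    # Normalize by the absmax, then snap each value to its nearest codebook level.
    absmax = np.abs(w).max()
    idx = np.abs(w[:, None] / absmax - codebook[None, :]).argmin(axis=1)
    return idx.astype(np.uint8), absmax                  # 4-bit codes, stored here in uint8

def nf4_like_dequantize(idx, absmax):
    return codebook[idx] * absmax

w = np.random.randn(256).astype(np.float32)
idx, absmax = nf4_like_quantize(w)
print("mean abs error:", np.abs(w - nf4_like_dequantize(idx, absmax)).mean())

In the QLoRA-style setup from the "Instruction Tuning with 4-bit + Adapters" segment, weights frozen in this 4-bit format are paired with small trainable adapters, and "Nested Quantization" additionally quantizes the per-block absmax constants to save a further fraction of a bit per parameter.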
Taught by
Center for Language & Speech Processing (CLSP), JHU