Overview
Dive into an in-depth technical session on the TensorFlow Model Optimization Toolkit, focusing on quantization and pruning techniques. Explore the challenges of quantization, learn about different approaches including quantization during and after training, and understand the benefits of integer and hybrid quantization. Discover pruning tools, their implementation, and practical examples. Gain insights into quantization kernels, specs, and APIs, as well as the accuracy and performance benefits of these optimization techniques. Enhance your understanding of matrix multiplication in the context of model optimization as presented by TensorFlow Software Engineer Suharsh Sivakumar in this 43-minute technical deep dive.
Syllabus
Introduction
Overview
Why does this matter
Quantization is hard
Quantization during training
Quantization after training
Pruning
Quantization kernels
Quantization spec
Symmetry
Per-channel quantization
Quantization tools
Integer Quantization
Hybrid Quantization
Post-training Integer Quantization
Quantization Accuracy
Quantization Benefits
Summary
Pruning Tools
Quantization API
Pruning Example
Pruning Summary
Matrix Multiplication
Taught by
TensorFlow