Overview
Explore advanced network quantization and compression techniques in this tutorial from the tinyML Summit 2021. The session examines the challenges of running AI on end devices with tight power and thermal budgets, presents Qualcomm's research into novel quantization and compression methods for overcoming those constraints, and shows how to apply these techniques with the AI Model Efficiency Toolkit (AIMET). Topics include why quantization matters, common quantization models, post-training techniques such as Data Free Quantization and Cross Layer Equalization, and the accuracy results these methods achieve. Suited to developers and researchers looking to optimize AI models for resource-constrained environments.
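As a taste of the hands-on portion, the sketch below shows how post-training Cross Layer Equalization followed by quantization simulation might be applied with AIMET's PyTorch front end. The model choice, input shape, bit-widths, and exact signatures (equalize_model, QuantizationSimModel, compute_encodings) are assumptions based on AIMET's public documentation and vary between releases; treat this as an illustrative sketch, not the presenters' exact workflow.

    import torch
    from torchvision import models

    # AIMET imports -- module paths and signatures are assumptions and differ by release.
    from aimet_torch.cross_layer_equalization import equalize_model
    from aimet_torch.quantsim import QuantizationSimModel

    # Start from a pretrained FP32 model (ResNet-18 used purely as an example).
    model = models.resnet18(pretrained=True).eval()
    input_shape = (1, 3, 224, 224)
    dummy_input = torch.randn(input_shape)

    # Post-training Cross Layer Equalization: folds batch norms and rescales
    # weights across adjacent layers so quantization loses less accuracy.
    equalize_model(model, input_shape)

    # Simulate 8-bit quantization of weights and activations.
    sim = QuantizationSimModel(model,
                               dummy_input=dummy_input,
                               default_param_bw=8,
                               default_output_bw=8)

    # Small calibration pass to compute quantization encodings (scale/offset per tensor).
    def forward_pass(sim_model, _):
        with torch.no_grad():
            sim_model(dummy_input)

    sim.compute_encodings(forward_pass, forward_pass_callback_args=None)

    # sim.model now mimics quantized inference and can be evaluated to
    # estimate on-target accuracy before export.

A usage note: Data Free Quantization methods like Cross Layer Equalization need no labeled training data, which is why they are run before quantization simulation in this sketch; only a brief unlabeled calibration pass is required to compute encodings.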
Syllabus
Introduction
Welcome
Challenges
Qualcomm's research
AIMET overview
AIMET features
AIMET quantization library
GitHub
Snapchat
Quantization performance
Q&A
RNN support
Presentation
Why quantize
Quantization model
Post-training techniques
Questions
Data Free Quantization
Cross Layer Equalization
Results
Taught by
tinyML