TILE-MPQ: Design Space Exploration of Tightly Integrated Layer-Wise Mixed-Precision Quantized Units for TinyML Inference
EDGE AI FOUNDATION via YouTube
Overview
Watch a technical conference talk from tinyML Asia 2022 exploring the design space of Tightly Integrated Layer-WisE Mixed-Precision Quantized (TILE-MPQ) units for TinyML inference. Learn about a novel mixed-precision quantization (MPQ) search algorithm that samples layer-wise sensitivity using metrics that incorporate both accuracy and hardware cost. Discover how this approach achieves 3-11% higher inference accuracy than existing MPQ strategies at similar hardware cost. Explore the proposed processing-in-memory architecture, which integrates the optimal MPQ policies through co-design of the Instruction Set Architecture and the micro-architecture. Understand how analyzing single-layer sensitivity narrows the search space, and how MPQ schemes are adapted at the hardware level so that both AI and non-AI tasks can be processed seamlessly.
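To make the idea of a layer-wise sensitivity metric that combines accuracy and hardware cost more concrete, here is a minimal Python sketch. It is not the algorithm presented in the talk. Every name in it (`quantize`, `layer_error`, `hardware_cost`, `search_policy`, `trade_off`) is hypothetical, mean squared quantization error stands in for accuracy loss, and total weight storage in bits stands in for hardware cost.

```python
import random


def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats (assumes bits >= 2)."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def layer_error(weights, bits):
    """Mean squared quantization error: an assumed proxy for accuracy loss."""
    quantized = quantize(weights, bits)
    return sum((w - q) ** 2 for w, q in zip(weights, quantized)) / len(weights)


def hardware_cost(weights, bits):
    """Assumed hardware-cost proxy: weight storage in bits."""
    return len(weights) * bits


def search_policy(layers, candidates=(2, 4, 8), trade_off=1e-6):
    """Pick, for each layer independently, the bit-width with the lowest
    combined score: error + trade_off * cost."""
    policy = {}
    for name, weights in layers.items():
        scores = {
            bits: layer_error(weights, bits)
            + trade_off * hardware_cost(weights, bits)
            for bits in candidates
        }
        policy[name] = min(scores, key=scores.get)
    return policy


if __name__ == "__main__":
    random.seed(0)
    # Two toy "layers" with different weight spreads and sizes.
    layers = {
        "conv1": [random.gauss(0.0, 1.0) for _ in range(64)],
        "fc": [random.gauss(0.0, 0.1) for _ in range(1024)],
    }
    print(search_policy(layers))
```

Scoring each layer on its own keeps the search linear in the number of layers rather than exponential in it, which is one simple way to read the overview's point about single-layer sensitivity narrowing the search space. A real search would measure accuracy on validation data and use a cost model of the target hardware.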
Syllabus
tinyML Asia 2022 Xiaotian Zhao: TILE-MPQ: Design Space Exploration of Tightly Integrated...
Taught by
EDGE AI FOUNDATION