TILE-MPQ: Design Space Exploration of Tightly Integrated Layer-Wise Mixed-Precision Quantized Units for TinyML Inference

Overview

Watch a technical talk from tinyML Asia 2022 on the design space of Tightly Integrated Layer-WisE Mixed-Precision Quantized (TILE-MPQ) units for TinyML inference. Learn about a novel mixed-precision quantization (MPQ) search algorithm that samples layer-wise sensitivity using metrics that combine accuracy and hardware cost. See how this approach achieves 3-11% higher inference accuracy than existing MPQ strategies at similar hardware cost. Explore the proposed processing-in-memory architecture, which integrates the optimal MPQ policies through Instruction Set Architecture (ISA) and micro-architecture co-design. Understand how analyzing single-layer sensitivity narrows the search space, and how MPQ schemes are adapted at the hardware level so that both AI and non-AI tasks are processed seamlessly.
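To make the idea of a layer-wise, cost-aware bit-width search more concrete, here is a minimal Python sketch. It is not the algorithm presented in the talk. It assumes that a per-layer accuracy drop and a per-layer hardware cost have already been measured for each candidate bit width, and it combines them with a simple weighted sum. The names `BITS`, `score`, `choose_policy`, the weight `lam`, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Candidate bit widths (an assumption for this sketch, not taken from the talk).
BITS: Tuple[int, ...] = (8, 4, 2)


def score(acc_drop: float, hw_cost: float, lam: float) -> float:
    """Hypothetical combined metric: accuracy drop plus weighted hardware cost.

    Lower is better. `lam` trades accuracy against hardware cost.
    """
    return acc_drop + lam * hw_cost


def choose_policy(
    acc_drop: List[Dict[int, float]],
    hw_cost: List[Dict[int, float]],
    lam: float = 1.0,
) -> List[int]:
    """Pick, independently for each layer, the bit width with the lowest score.

    acc_drop[i][b]: accuracy drop when only layer i is quantized to b bits.
    hw_cost[i][b]:  relative hardware cost of layer i at b bits.
    """
    policy = []
    for layer_drop, layer_cost in zip(acc_drop, hw_cost):
        best = min(BITS, key=lambda b: score(layer_drop[b], layer_cost[b], lam))
        policy.append(best)
    return policy


# Made-up measurements for a two-layer network (illustrative values only).
example_acc_drop = [
    {8: 0.0, 4: 0.8, 2: 6.0},  # layer 0 is sensitive to low precision
    {8: 0.0, 4: 0.1, 2: 0.3},  # layer 1 tolerates low precision
]
example_hw_cost = [
    {8: 1.0, 4: 0.5, 2: 0.25},  # layer 0 is cheap
    {8: 4.0, 4: 2.0, 2: 1.0},   # layer 1 is expensive
]

if __name__ == "__main__":
    print(choose_policy(example_acc_drop, example_hw_cost, lam=1.0))
```

In this toy setup the sensitive, cheap layer keeps a high bit width while the robust, expensive layer is pushed to a low one. Scoring each layer on its own is what keeps the search small: the number of evaluations grows linearly with the number of layers instead of exponentially with it.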

Syllabus

tinyML Asia 2022 Xiaotian Zhao: TILE-MPQ: Design Space Exploration of Tightly Integrated...

Taught by

EDGE AI FOUNDATION

Related Courses

The Model Efficiency Pipeline: Enabling Deep Learning Inference at the Edge

Hardware-Aware Model Optimization in Arm Ethos-U65 NPU

TinyML Talks Pakistan - SuperSlash - Unifying Design Space Exploration and Model Compression

Using AI to Design Energy-Efficient AI Accelerators for the Edge

Software-Driven TinyML Hardware Co-Design for AI Power Smart Devices