Overview
Explore performance optimizations for Edge AI applications in this tinyML Talks local webcast featuring Felix Johnny and Fredrik Knutsson from Arm. Dive into identifying bottlenecks in ML model inference and learn effective solutions, focusing on the CMSIS-NN library for Arm Cortex-M processors. Discover common optimization methodologies, understand how operator shapes affect performance, and gain insights for model designers. Watch a live demonstration of CMSIS-NN with TensorFlow Lite for Microcontrollers on an Arduino Nano 33 BLE Sense board, showcasing the benefits of optimization techniques. Topics include cycle measurements, cycle-bound and memory-bound analysis, library optimizations, hyperparameter optimization, and person detection on embedded hardware.
Syllabus
Introduction
Presentation
Optimizations
Cycle Measurements
Cycle Bound Analysis
Memory Bound Analysis
Library Optimizations
Fully Connected Operators
Loop unrolling
Checking compiler output
Cycle bound improvement
Simplifications
Summary
Hyperparameter Optimization
Question for Felix
TensorFlow Lite for Microcontrollers
Merging CMSIS-NN and optimized kernels
Reference kernels
Optimization
Person Detection
Hardware
Demo
Questions
Taught by
tinyML