Overview
Explore client-side deep learning optimization techniques using PyTorch in this 35-minute conference talk from Strange Loop 2021. Dive into the challenges and solutions for implementing real-time computer vision models on mobile devices, focusing on overcoming network constraints and reducing developer costs. Learn about porting custom architectures, serializing models from Python to binary assets, and addressing hardware compatibility issues. Discover the theory and practice of model quantization, fusion, and efficient tensor storage. Gain insights into benchmarking client-side model performance across various devices and operating systems. Presented by Tyler Kirby, Principal Data Scientist at UniGroup, and Shane Caldwell, Director of Artificial Intelligence at UniGroup, this talk covers topics such as eager execution, scripting, tracing, intermediate representations, running models in C++, quantization techniques, and experimental results.
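As a rough illustration of the quantization theory mentioned above, the sketch below maps floating-point values to 8-bit integers using an affine scale and zero point, then maps them back. It is plain Python rather than PyTorch, and the function names (compute_qparams, quantize, dequantize) and the sample weights are assumptions made up for this example, not code from the talk.

```python
# Minimal sketch of affine (asymmetric) int8 quantization.
# All names and sample values here are hypothetical, for illustration only.

QMIN, QMAX = -128, 127  # range of a signed 8-bit integer


def compute_qparams(x_min, x_max):
    """Derive a scale and zero point that cover [x_min, x_max]."""
    # Widen the range so that 0.0 is always exactly representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (QMAX - QMIN)
    if scale == 0.0:
        scale = 1.0  # degenerate case: every value is zero
    zero_point = round(QMIN - x_min / scale)
    zero_point = max(QMIN, min(QMAX, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to int8 codes: q = clamp(round(x / scale) + zero_point)."""
    return [max(QMIN, min(QMAX, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in codes]


weights = [-1.0, -0.3, 0.0, 0.7, 3.0]
scale, zero_point = compute_qparams(min(weights), max(weights))
codes = quantize(weights, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
```

Each restored value differs from the original by a small rounding error on the order of the scale, which is the accuracy traded for storing one byte per value instead of four.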
Syllabus
Intro
Who are we?
Why PyTorch?
Advantages of Eager Execution
Optimization necessitates looking under the hood
Axes of Optimization
Production Considerations
Scripting Handles control flow and other arbitrary
Scripting + Tracing
Intermediate Representations in PyTorch
Running in C++
Speed tips
Running Arbitrary Models
Lite Interpreter
What is Quantization?
Quantization in PyTorch
Eager Mode Quantization
Dynamic Quantization
Quantization Aware Training
Experimental Results
Channels Last Format
Addendum
Taught by
Strange Loop Conference