Quantization in Neural Networks - Lecture 5

Overview

This lecture from MIT's TinyML and Efficient Deep Learning Computing course introduces neural network quantization. It reviews the numeric data types used in modern computing systems, then covers K-means-based quantization and linear quantization. The broader course teaches how to optimize deep learning models for resource-constrained devices, enabling AI applications on mobile and IoT platforms, with strategies for efficient inference that include model compression, pruning, and neural architecture search. It also offers hands-on experience implementing deep learning applications on microcontrollers, mobile phones, and quantum machines through an open-ended design project focused on mobile AI.

Syllabus

Lecture 05 - Quantization (Part I) | MIT 6.S965

Taught by

MIT HAN Lab


