Overview
This 58-minute tinyML Talks webcast reviews compression methods for deep convolutional neural networks (CNNs). Professor Vincent Gripon discusses techniques for compressing and accelerating CNNs, including pruning, distillation, clustering, and quantization, and weighs the pros and cons of each for shrinking CNN architectures to fit tinyML devices without compromising accuracy. The talk covers the challenges of deep learning, data centers, layers, datasets, architectures, and the number of operations in convolutional layers, then turns to quantization experiments, FLOP rates, and energy consumption. It closes with a Q&A session on deploying CNNs on resource-constrained devices.
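As a rough illustration of two topics named above, the number of operations in a convolutional layer and the quantization of parameters, here is a minimal Python sketch. It is not taken from the talk: the function names `conv2d_macs` and `quantize_uniform` are hypothetical, and the quantizer shown is a generic symmetric uniform scheme chosen for simplicity, which may differ from the methods the speaker presents.

```python
def conv2d_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate count for one standard 2D convolution.

    Assumes a square k x k kernel, no grouping, and ignores the bias term.
    """
    return h_out * w_out * c_in * c_out * k * k


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of a list of float weights.

    Each weight is snapped to the nearest of the evenly spaced levels that a
    signed integer of the given bit width can represent, then mapped back to
    a float so the rounding error is easy to inspect.
    """
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)              # nothing to scale
    scale = max_abs / levels              # step size between levels
    return [round(w / scale) * scale for w in weights]


# A 3x3 convolution from 3 to 16 channels on a 32x32 output map.
macs = conv2d_macs(h_out=32, w_out=32, c_in=3, c_out=16, k=3)

# Quantize a few illustrative weights to 8 bits.
original = [1.0, -0.4, 0.1]
quantized = quantize_uniform(original, bits=8)
```

Storing each weight as an 8-bit integer instead of a 32-bit float cuts parameter memory by a factor of four, at the cost of a rounding error of at most half a quantization step per weight.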
Syllabus
Introduction
Outline
Deep Learning
Problems with Deep Learning
Data Centers and Deep Learning
Layers
Data Sets
Questions
Architectures
Number of operations
Convolutional layers
Comparison of architectures
Comparing architectures
Retraining feature maps
Quantizing parameters
Quantization experiment
Flops rate
Compensation during training
Quantization during training
Quantization for precision
Results
Shift Attention Layers
Clustering weights
Energy consumption
Summary
Conclusion
Questions and Recap
Taught by
tinyML