A Review of Compression Methods for Deep Convolutional Neural Networks

Overview

This 58-minute tinyML Talks webcast reviews compression methods for deep convolutional neural networks (CNNs). Professor Vincent Gripon discusses techniques for compressing and accelerating CNNs, including pruning, distillation, clustering, and quantization, and weighs the pros and cons of each. The aim is to shrink CNN architectures so they fit on tinyML devices while limiting the loss in accuracy. The talk covers the challenges of deep learning, data centers, layers, datasets, architectures, and the number of operations in convolutional layers, then turns to quantization experiments, FLOPS rates, and energy consumption. It closes with a Q&A session on deploying CNNs on resource-constrained devices.
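As a rough illustration of two quantities mentioned above, the number of operations in a convolutional layer and the effect of quantizing its parameters, here is a minimal sketch. It is not taken from the talk. The function names and the example layer shape (64 input channels, 128 output channels, 3x3 kernel, 56x56 output) are hypothetical, and the formulas assume a standard dense 2D convolution with no grouping.

```python
def conv2d_cost(in_channels, out_channels, kernel_size, out_height, out_width, bias=True):
    """Return (parameter count, multiply-accumulate count) for one dense 2D conv layer.

    Assumes a square kernel and no channel grouping (illustrative only).
    """
    weights = out_channels * in_channels * kernel_size * kernel_size
    params = weights + (out_channels if bias else 0)
    # Each output position of each output channel needs in_channels * k * k MACs,
    # so the total is the weight count times the number of output positions.
    macs = weights * out_height * out_width
    return params, macs


def weight_storage_bytes(params, bits_per_weight):
    """Bytes needed to store `params` values at a given bit width, rounded up."""
    return (params * bits_per_weight + 7) // 8


# Hypothetical layer: 64 -> 128 channels, 3x3 kernel, 56x56 output map, no bias.
params, macs = conv2d_cost(64, 128, 3, 56, 56, bias=False)
fp32_bytes = weight_storage_bytes(params, 32)
int8_bytes = weight_storage_bytes(params, 8)

print(f"parameters: {params}")
print(f"multiply-accumulates: {macs}")
print(f"storage at 32 bits: {fp32_bytes} bytes")
print(f"storage at 8 bits: {int8_bytes} bytes")
```

Under these assumptions, moving from 32-bit to 8-bit weights cuts storage by a factor of four and leaves the operation count unchanged. The accuracy cost of doing so is the kind of trade-off the talk examines.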

Syllabus

Introduction
Outline
Deep Learning
Problems with Deep Learning
Data Centers and Deep Learning
Layers
Data Sets
Questions
Architectures
Number of operations
Convolutional layers
Comparison of architectures
Comparing architectures
Retraining feature maps
Quantizing parameters
Quantization experiment
Flops rate
Compensation during training
Quantization during training
Quantization for precision
Results
Shift Attention Layers
Clustering weights
Energy consumption
Summary
Conclusion
Questions and Recap

Taught by

tinyML
