Overview
Explore the cutting-edge world of second-order pruning algorithms for state-of-the-art model compression in this 42-minute video presentation by Eldar Kurtić, Research Consultant at Neural Magic. Dive into the research, production results, and intuition behind these powerful techniques, which enable higher sparsity while maintaining accuracy. Learn how to achieve significant model size reduction, lower latency, and higher throughput by removing the weights that least affect the loss function. Discover real-world examples, such as pruning a ResNet-50 image classification model by 95% while retaining 99% of its baseline accuracy, shrinking the model file from 90.8MB to 9.3MB. Follow along as the speaker walks through the Optimal Brain Damage framework, the Optimal Brain Surgeon (OBS) framework, and efficient second-order approximations. Gain insights into applications in classification, natural language processing, and DeepSparse deployment. Understand the intuition behind weight updates and explore open-source implementations. Conclude with a Q&A session addressing how to apply these pruning algorithms to your own machine learning projects.
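The core idea above, ranking weights by how little their removal would change the loss, can be sketched with the classic Optimal Brain Damage saliency, s_i = ½·H_ii·w_i², where H_ii is the diagonal of the loss Hessian. The function names and the use of a precomputed Hessian diagonal here are illustrative assumptions, not code from the talk:

```python
import numpy as np

def obd_saliency(weights, hessian_diag):
    # OBD saliency: s_i = 1/2 * H_ii * w_i^2
    # (second-order estimate of the loss increase from zeroing weight i)
    return 0.5 * hessian_diag * weights ** 2

def prune_by_saliency(weights, hessian_diag, sparsity):
    # Zero out the fraction `sparsity` of weights with the lowest saliency.
    s = obd_saliency(weights, hessian_diag)
    k = int(sparsity * weights.size)
    drop = np.argsort(s)[:k]          # indices of least-salient weights
    mask = np.ones_like(weights, dtype=bool)
    mask[drop] = False
    return weights * mask, mask

# Toy example: with a uniform Hessian diagonal, saliency reduces to
# magnitude pruning, so the two smallest weights are removed.
w = np.array([0.1, -2.0, 0.5, 1.0])
h = np.ones_like(w)
pruned, mask = prune_by_saliency(w, h, sparsity=0.5)
# pruned -> [0.0, -2.0, 0.0, 1.0]
```

In practice the Hessian diagonal is itself approximated (e.g. with an empirical Fisher estimate), since computing it exactly is intractable for large networks; that approximation is the "efficient second-order" part the talk covers.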
Syllabus
Introduction
Agenda
What is Pruning
Classification
Observations
Timeline
Optimal Brain Damage Framework
OBS Framework
Efficient Second-Order Approximation
Results
Natural Language Processing
DeepSparse Deployment
Next Step Up Results
Intuition
Weight Update
Open Source
Q&A Session
Taught by
Neural Magic