Unlock Faster and More Efficient LLMs with SparseGPT - Neural Magic

Overview

This 42-minute video presentation by Neural Magic introduces SparseGPT, a one-shot technique for compressing large language models (LLMs). Learn how to prune and quantize LLMs in a single step so they can be deployed on standard CPUs at GPU-like speeds. The talk covers the neural network pruning problem, the mathematics of compression, one-shot compression of GPT models, experimental validation, and the combination of sparsity and quantization. It then shows how the DeepSparse runtime exploits the resulting sparsity, reviews deployment benchmarks for LLMs on CPU hardware, and argues that software optimization can outperform hardware solutions. It also explains how SparseGPT shifts the Pareto frontier, making it possible for anyone to run and sparsify LLMs. The session concludes with a Q&A.
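To make "prune and quantize in a single step" concrete, here is a minimal sketch in plain Python. It is not the SparseGPT algorithm from the video: SparseGPT chooses and adjusts weights layer by layer using approximate second-order information, whereas this toy version swaps in simple magnitude pruning followed by symmetric integer quantization. The function names `prune_and_quantize` and `dequantize`, the sample weights, and the default parameters are all assumptions made for illustration.

```python
def prune_and_quantize(weights, sparsity=0.5, bits=8):
    """Toy stand-in for one-shot compression (NOT SparseGPT itself):
    zero the smallest-magnitude weights, then map the survivors to
    signed integers using one shared scale."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")

    # Number of weights to drop, rounded down.
    n_prune = int(len(weights) * sparsity)

    # Indices ordered by magnitude, smallest first; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])

    # Symmetric quantization: the largest surviving magnitude maps to qmax.
    kept_max = max(
        (abs(w) for i, w in enumerate(weights) if i not in pruned),
        default=0.0,
    )
    qmax = 2 ** (bits - 1) - 1
    scale = kept_max / qmax if kept_max > 0 else 1.0

    quantized = [
        0 if i in pruned else round(w / scale)
        for i, w in enumerate(weights)
    ]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floating-point weights from the integer codes."""
    return [v * scale for v in quantized]


# Hypothetical example weights, chosen only for illustration.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
q, scale = prune_and_quantize(weights, sparsity=0.5)
restored = dequantize(q, scale)
```

With 50% sparsity, the three smallest-magnitude weights become exact zeros, and each surviving weight is stored as an 8-bit integer code whose reconstruction error is at most half the scale. A sparsity-aware runtime can skip the zeros and operate on the integer codes, which is the kind of saving the talk attributes to combining sparsity with quantization.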

Syllabus

Intro
Massive Deep Models are Great
The Neural Network Pruning Problem
The Mathematics of Compression
One-Shot Compression of GPT Models
The General Approach
Our Approach: Quantization Version
Experimental Validation
Combining Sparsity and Quantization
Exploiting with DeepSparse
Software Beats Hardware (continued)
Transforming the Pareto Frontier
Enabling Anyone to Run
Enabling Anyone to Sparsify
Questions

Taught by

Neural Magic
