Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

YouTube

Compressing Large Language Models (LLMs) with Python Code - 3 Techniques

Shaw Talebi via YouTube

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore three methods for compressing Large Language Models (LLMs) - Quantization, Pruning, and Knowledge Distillation/Model Distillation - with accompanying Python code examples. Learn about the challenges of model size and the benefits of compression techniques. Follow along with a practical demonstration of combining Knowledge Distillation and Quantization to compress a BERT-based phishing classifier model. Access additional resources including a blog post, GitHub repository, pre-trained models, and dataset for further exploration of LLM compression techniques.

Syllabus

Intro -
"Bigger is Better" -
The Problem -
Model Compression -
1 Quantization -
2 Pruning -
3 Knowledge Distillation -
Example: Compressing a model with KD + Quantization -

Taught by

Shaw Talebi

Reviews

Start your review of Compressing Large Language Models (LLMs) with Python Code - 3 Techniques

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.