GGUF Quantization of Large Language Models Using LLAMA.cpp

AI Bites via YouTube

Overview

Learn how to quantize Large Language Models (LLMs) with llama.cpp in this 12-minute tutorial video, which shows how to run these models efficiently on laptops and small devices without a GPU. Follow a practical demonstration of quantizing a fine-tuned 2-billion-parameter Gemma model on a MacBook; the same steps apply to any fine-tuned LLM. The video covers installing llama.cpp (an open-source C/C++ library), the preliminaries of model quantization, and pushing LLMs to the Hugging Face Hub. The presenter is a Machine Learning researcher with 15 years of software engineering experience who walks through the complete process from introduction to conclusion.
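
For readers new to the topic, the core idea of quantization can be sketched in a few lines of Python. This is a simplified illustration, not code from the video and not llama.cpp's actual implementation or GGUF file layout. It assumes a symmetric 4-bit scheme in which each block of weights shares one floating-point scale; the function names and the block size of 32 are chosen for illustration only.

```python
# Illustrative block-wise 4-bit quantization.
# Assumption: a simple symmetric scheme with one scale per block.
# This is NOT llama.cpp's code or the GGUF on-disk format.

def quantize_block(values):
    """Map a block of floats to signed 4-bit integers plus one float scale."""
    max_abs = max(abs(v) for v in values)
    # Scale so the largest magnitude maps to 7; avoid division by zero.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the signed 4-bit range.
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from a quantized block."""
    return [q * scale for q in quants]


def quantize(weights, block_size=32):
    """Split weights into fixed-size blocks and quantize each one."""
    return [
        quantize_block(weights[i:i + block_size])
        for i in range(0, len(weights), block_size)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one list."""
    restored = []
    for scale, quants in blocks:
        restored.extend(dequantize_block(scale, quants))
    return restored


if __name__ == "__main__":
    import random

    random.seed(0)
    weights = [random.uniform(-1, 1) for _ in range(64)]
    blocks = quantize(weights)
    restored = dequantize(blocks)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"blocks: {len(blocks)}, worst absolute error: {worst:.4f}")
```

Each weight is stored as a 4-bit integer instead of a 16- or 32-bit float, at the cost of a rounding error of at most half the block's scale. That trade of a small accuracy loss for a much smaller memory footprint is what lets quantized models run on laptops without a GPU.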

Syllabus

Introduction
Push LLM to HuggingFace Hub
llama.cpp
llama.cpp installation
Preliminaries
Quantization
Conclusion

Taught by

AI Bites
