YouTube

Understanding 4-bit Quantization and QLoRA - Memory Efficient Fine-tuning of LLMs

Discover AI via YouTube

Overview

Learn about QLoRA 4-bit quantization for memory-efficient fine-tuning of large language models in a detailed 42-minute video tutorial covering both theory and practical implementation. Explore Parameter-Efficient Fine-Tuning (PEFT) methods, with a specific focus on how 4-bit quantization works in QLoRA. Follow a hands-on Google Colab demonstration that fine-tunes a Falcon 7B model using QLoRA 4-bit quantization and Transformer Reinforcement Learning (TRL). Gain insight into Hugging Face Accelerate's support for 4-bit QLoRA models, and access practical code examples for implementation. Build on foundational knowledge of LoRA and other PEFT methods while mastering advanced techniques for optimizing large language models.
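To give a feel for the core idea the tutorial covers, here is a minimal, self-contained sketch of blockwise 4-bit NF4 quantization in plain NumPy. It is an illustration only, not the bitsandbytes implementation QLoRA actually uses; the 16 code values are the NF4 constants (quantiles of a standard normal) reproduced approximately from the QLoRA paper, and the block size of 64 matches the paper's default.

```python
import numpy as np

# Approximate NF4 code values (quantiles of N(0, 1)), as tabulated
# in the QLoRA paper; reproduced here for illustration only.
NF4_CODES = np.array([
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
])

def quantize_nf4(weights, block_size=64):
    """Blockwise 4-bit quantization: scale each block by its absolute
    maximum, then snap each normalized value to the nearest NF4 code.
    Returns 4-bit code indices (0..15) and per-block scales."""
    w = weights.reshape(-1, block_size)
    scales = np.abs(w).max(axis=1, keepdims=True)   # absmax per block
    normed = w / scales                             # values now in [-1, 1]
    idx = np.abs(normed[..., None] - NF4_CODES).argmin(axis=-1)
    return idx.astype(np.uint8), scales

def dequantize_nf4(idx, scales, shape):
    """Recover an approximation of the original weights from the
    stored code indices and per-block scales."""
    return (NF4_CODES[idx] * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 64)).astype(np.float32)     # toy weight tensor
idx, scales = quantize_nf4(w)
w_hat = dequantize_nf4(idx, scales, w.shape)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

In QLoRA the frozen base model's weights are stored in this 4-bit form (cutting memory roughly 4x versus FP16), dequantized on the fly for each forward pass, and only the small LoRA adapter matrices are trained in higher precision.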

Syllabus

Understanding 4bit Quantization: QLoRA explained (w/ Colab)

Taught by

Discover AI
