
YouTube

QLoRA: Efficient Training of Large Language Models Using Quantization and Low-Rank Adaptation

AI Bites via YouTube

Overview

Explore a 12-minute technical video that breaks down the groundbreaking QLoRA approach for finetuning Large Language Models (LLMs) on a single GPU through three key innovations: the NormalFloat data type, Double Quantization, and Paged Optimizers. Learn how these components work together, starting with fundamental quantization concepts and their limitations, progressing through blockwise quantization techniques (sketched in code below), and ending with the implementation of QLoRA finetuning (a configuration sketch follows the syllabus). Compare the LoRA and QLoRA approaches while examining practical results and performance metrics. Delivered by a seasoned Machine Learning Researcher with 15 years of software engineering experience, the presentation includes detailed timestamps for easy navigation through topics and features clear technical explanations supported by visual animations created with Manim.
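
To make the quantization sections concrete, here is a minimal NumPy sketch of blockwise absmax quantization, the idea the video builds on before introducing NormalFloat: each block of weights gets its own scale, so an outlier degrades precision only within its own block rather than across the whole tensor. The block size, function names, and int8 target are illustrative assumptions, not taken from the video (QLoRA itself quantizes to 4-bit NF4).

```python
import numpy as np

def quantize_blockwise(x, block_size=64):
    """Quantize a flat float array to int8 with one absmax scale per block."""
    blocks = x.reshape(-1, block_size)
    # Per-block scale: the largest absolute value in the block.
    scales = np.abs(blocks).max(axis=1, keepdims=True)
    scales = np.maximum(scales, 1e-8)  # guard against all-zero blocks
    q = np.round(blocks / scales * 127).astype(np.int8)  # map into [-127, 127]
    return q, scales

def dequantize_blockwise(q, scales):
    return (q.astype(np.float32) / 127) * scales

x = np.random.randn(256).astype(np.float32)
x[0] = 50.0  # an outlier: without blocking, it would crush precision everywhere
q, scales = quantize_blockwise(x)
x_hat = dequantize_blockwise(q, scales).ravel()
print("max reconstruction error:", np.abs(x - x_hat).max())
```

Double Quantization, covered later in the video, pushes the same idea one step further by quantizing the per-block scales themselves to save additional memory.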

Syllabus

- QLoRA
- Quantization
- Problem with Quantization
- Blockwise Quantization
- NormalFloat
- Double Quantization
- Paged Optimizers
- QLoRA Finetuning
- LoRA vs QLoRA
- Results
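
As a companion to the NormalFloat, Double Quantization, and QLoRA Finetuning topics above, the following is a hedged configuration sketch using the Hugging Face transformers, peft, and bitsandbytes libraries; the model id, LoRA hyperparameters, and target modules are illustrative assumptions, not values from the video.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 base weights with Double Quantization, as in the QLoRA recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NormalFloat data type
    bnb_4bit_use_double_quant=True,      # quantize the quantization constants too
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",          # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters train in 16-bit on top of the frozen 4-bit base weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # illustrative choice of layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Paged Optimizers are selectable in a standard training loop, e.g.
# transformers.TrainingArguments(optim="paged_adamw_32bit", ...).
```

The split mirrors the LoRA vs. QLoRA comparison in the video: LoRA alone keeps the base model in 16-bit, while QLoRA stores it in 4-bit NF4 and dequantizes on the fly during the forward pass, which is what makes single-GPU finetuning feasible.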

Taught by

AI Bites
