Full Fine-Tuning vs LoRA and QLoRA - Comparison and Best Practices

Overview

This 53-minute video from Trelis Research compares full fine-tuning with LoRA and QLoRA in terms of VRAM requirements, training time, and output quality. It explains how each of the three methods works and how to choose the learning rate, rank, and alpha. It then covers hyperparameter selection for fine-tuning Mistral 7B, gives specific tips on QLoRA, regularization, and adapter merging, offers tips for using Unsloth, and introduces LoftQ for LoRA-aware quantization. The video closes with a step-by-step TinyLlama QLoRA walkthrough and a comparison of Mistral 7B fine-tuning results across the different methods.

Syllabus

Comparing full fine-tuning and LoRA fine-tuning
Video Overview
Comparing VRAM, training time and quality
How full fine-tuning works
How LoRA works
How QLoRA works
How to choose learning rate, rank and alpha
Choosing hyperparameters for Mistral 7B fine-tuning
Specific tips for QLoRA, regularization and adapter merging
Tips for using Unsloth
LoftQ - LoRA-aware quantization
Step-by-step TinyLlama QLoRA
Mistral 7B Fine-tuning Results Comparison
Wrap up
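
Example: rank, alpha and trainable parameters

The rank and alpha values mentioned in the overview can be made concrete with a little arithmetic. The sketch below is not taken from the video. It assumes a single square weight matrix of size 4096 x 4096 and the commonly used LoRA convention in which the low-rank update is scaled by alpha / rank. The function names and the chosen rank and alpha are illustrative only.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A (assumed convention: alpha / rank)."""
    return alpha / rank


# Assumed, illustrative values (not from the video).
d_in = d_out = 4096
rank, alpha = 8, 16

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
scale = lora_scale(alpha, rank)

print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (rank={rank}): {lora} trainable parameters")
print(f"fraction trained: {lora / full:.4%}")
print(f"update scale alpha/rank: {scale}")
```

Under these assumptions the adapter trains well under 1% of the parameters of the matrix it is attached to, which is the main reason LoRA needs less VRAM for gradients and optimizer state than full fine-tuning. Doubling the rank doubles the adapter size, while alpha only changes how strongly the adapter's update is weighted.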