Optimizing LLM Fine-Tuning with PEFT and LoRA Adapter-Tuning for GPU Performance
Discover AI via YouTube
Syllabus
PEFT source code: LoRA, prefix tuning, …
Llama LoRA fine-tuning code
Create a PEFT-LoRA Seq2Seq model
Trainable parameters of the PEFT-LoRA model
get_peft_model
PEFT-LoRA 8-bit model of the OPT-6.7B LLM
load_in_8bit
INT8 Quantization explained
Fine-tune a quantized model
bfloat16 and the XLA compiler in PyTorch 2.0
Freeze all pre-trained layer weight tensors
Adapter-tuning of the PEFT-LoRA model
Save tuned PEFT-LoRA adapter weights
Run inference with the new PEFT-LoRA adapter-tuned LLM
Load your adapter-tuned PEFT-LoRA model
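The syllabus above walks through the core PEFT-LoRA workflow: freeze the pre-trained weight tensors, inject small trainable low-rank matrices, and check how few parameters actually train. The course itself uses the Hugging Face PEFT library (`get_peft_model`, `load_in_8bit`); as a library-free sketch of the underlying idea only, a minimal LoRA wrapper in plain PyTorch might look like this (the class name, rank, and layer sizes are illustrative assumptions, not the course's code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A is (r x in) and B is (out x r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        # Freeze all pre-trained layer weight tensors
        for p in self.base.parameters():
            p.requires_grad = False
        self.scale = alpha / r
        # A starts small and random; B starts at zero, so the
        # adapter is a no-op until fine-tuning updates it
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Count trainable vs. total parameters, as the course does for the PEFT model
layer = LoRALinear(nn.Linear(512, 512), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / total: {total}")  # only A and B train
```

Because `B` is initialized to zero, fine-tuning starts exactly from the pre-trained model's behavior, and only the tiny `A`/`B` tensors (the adapter) need to be saved and re-loaded for inference later.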
Taught by
Discover AI