Multi-GPU Fine-tuning with DDP and FSDP

Trelis Research via YouTube

Overview

Dive into the world of multi-GPU fine-tuning with this comprehensive tutorial on Distributed Data Parallel (DDP) and Fully Sharded Data Parallel (FSDP) techniques. Learn how to optimize VRAM usage, understand the intricacies of the Adam optimizer, and explore the trade-offs between various distributed training methods. Gain practical insights on choosing the right GPU setup, implementing LoRA and quantization for VRAM reduction, and utilizing tools like DeepSpeed and Accelerate. Follow along with code examples for Model Parallel, DDP, and FSDP implementations, and discover how to set up and use rented GPUs via SSH. By the end of this tutorial, you'll be equipped with the knowledge to efficiently fine-tune large language models across multiple GPUs.
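
To make the VRAM discussion concrete, here is a back-of-envelope tally of memory per parameter for full fine-tuning with Adam in mixed precision. The 2 + 2 + 12 bytes-per-parameter breakdown is the commonly quoted figure (weights, gradients, and FP32 optimizer states), not a number taken from the video, and it excludes activations, which depend on context length and batch size.

```python
# Rough per-parameter VRAM tally for full fine-tuning with Adam in mixed precision.
# Commonly quoted figures, not numbers from the video; activations are excluded
# and vary with batch size and context length.

def full_finetune_vram_gb(num_params_billion: float) -> dict:
    params = num_params_billion * 1e9
    weights = 2 * params            # bf16/fp16 model weights
    gradients = 2 * params          # bf16/fp16 gradients
    adam_states = 12 * params       # fp32 master weights + first moment + second moment
    return {
        "weights_GB": weights / 1e9,
        "gradients_GB": gradients / 1e9,
        "adam_states_GB": adam_states / 1e9,
        "total_GB_excl_activations": (weights + gradients + adam_states) / 1e9,
    }

# A 7B-parameter model needs roughly 16 bytes per parameter, i.e. about 112 GB,
# before activations, hence sharding (FSDP/DeepSpeed), LoRA, or quantisation.
print(full_finetune_vram_gb(7))
```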

Syllabus

Multi-GPU Distributed Training
Video Overview
Choosing a GPU setup
Understanding VRAM requirements in detail
Understanding Optimisation and Gradient Descent
How does the Adam optimizer work?
How the Adam optimiser affects VRAM requirements
Effect of activations, model context and batch size on VRAM
Tip for GPU setup: start with a small batch size
Reducing VRAM with LoRA and quantisation (see the LoRA sketch after the syllabus)
Quality trade-offs with quantisation and LoRA
Choosing between MP, DDP or FSDP
Distributed Data Parallel
Model Parallel and Fully Sharded Data Parallel FSDP
Trade-offs with DDP and FSDP
How does DeepSpeed compare to FSDP?
Using FSDP and DeepSpeed with Accelerate
Code examples for MP, DDP and FSDP
Using SSH with rented GPUs (Runpod)
Installation
Slight detour: setting a username and email for GitHub
Basic Model Parallel MP fine-tuning script
Fine-tuning script with Distributed Data Parallel DDP (see the DDP sketch after the syllabus)
Fine-tuning script with Fully Sharded Data Parallel FSDP
Running ‘accelerate config’ for FSDP (see the Accelerate sketch after the syllabus)
Saving a model after FSDP fine-tuning
Quick demo of a complete FSDP LoRA training script
Quick demo of an inference script after training
Wrap up
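
For the "Reducing VRAM with LoRA and quantisation" chapter, the sketch below shows the general pattern with Hugging Face Transformers and PEFT: load the base model in 4-bit and train only low-rank adapter weights. The model name, rank, and target modules are illustrative placeholders, not the settings used in the video.

```python
# Minimal LoRA-on-a-quantised-model setup (QLoRA-style); illustrative values only.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit base weights to cut VRAM
    bnb_4bit_compute_dtype=torch.bfloat16,   # do the matmuls in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "your-base-model",                       # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                    # adapter rank (placeholder)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],     # typical attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)   # only adapter params require grad
model.print_trainable_parameters()           # usually well under 1% of the model
```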
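
For the "Fine-tuning script with Distributed Data Parallel DDP" chapter, here is a minimal sketch of the DDP pattern: one process per GPU, a DistributedSampler to shard the data, and gradient all-reduce handled by the DDP wrapper. The tiny model and random data are placeholders, not the course's script; launch with `torchrun --nproc_per_node=<num_gpus> train_ddp.py`.

```python
# Minimal DDP training loop; launch with: torchrun --nproc_per_node=2 train_ddp.py
# Model, data, and hyperparameters are illustrative placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    dist.init_process_group("nccl")                      # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).cuda(local_rank)   # stand-in for an LLM
    model = DDP(model, device_ids=[local_rank])          # gradients all-reduced across GPUs

    dataset = TensorDataset(torch.randn(1024, 512), torch.randn(1024, 512))
    sampler = DistributedSampler(dataset)                # each rank sees a different shard
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)                         # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()              # DDP syncs grads during backward
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```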
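
And for the "Running ‘accelerate config’ for FSDP" and "Saving a model after FSDP fine-tuning" chapters, the general Accelerate workflow looks like the sketch below: `accelerate config` writes a YAML file selecting FSDP, the script wraps everything with `accelerator.prepare(...)`, and the full state dict is gathered before saving. Again, the model and data are placeholders rather than the video's code; launch with `accelerate launch train_fsdp.py`.

```python
# Minimal Accelerate loop that picks up FSDP settings from `accelerate config`.
# Placeholder model/data; launch with: accelerate launch train_fsdp.py
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

def main():
    accelerator = Accelerator()                          # reads FSDP options from the saved config
    model = torch.nn.Linear(512, 512)                    # stand-in for an LLM
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    dataset = TensorDataset(torch.randn(1024, 512), torch.randn(1024, 512))
    loader = DataLoader(dataset, batch_size=8)

    # prepare() wraps the model with FSDP (per the config) and shards the dataloader
    model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)                       # handles the sharded backward pass
        optimizer.step()

    # gather the full (unsharded) weights on the main process before saving
    accelerator.wait_for_everyone()
    state_dict = accelerator.get_state_dict(model)
    if accelerator.is_main_process:
        torch.save(state_dict, "model.pt")

if __name__ == "__main__":
    main()
```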

Taught by

Trelis Research
