Mastering Google's PaliGemma VLM: Tips and Tricks for Success and Fine-Tuning

Sam Witteveen via YouTube

Overview

This video tutorial introduces PaliGemma, Google's vision language model (VLM). It covers the model's architecture, capabilities, and applications, with background from the PaLI-3 and SigLIP papers. It then walks through the three pre-trained checkpoints and the available model sizes and releases, tries the model in a Hugging Face Spaces demo, and looks at the ScreenAI datasets. The coding sessions show how to use PaliGemma with Transformers and how to fine-tune it. Colab notebooks for inference and fine-tuning are provided as resources.

Syllabus

Intro
What is PaliGemma?
PaLI-3 Paper
SigLIP Paper
Hugging Face Blog: PaliGemma
PaliGemma: Three Pre-trained Checkpoints
PaliGemma: Different Sizes and Releases
PaliGemma Hugging Face Spaces Demo
ScreenAI Datasets
Code Time
Using PaliGemma with Transformers
PaliGemma Finetuning

Taught by

Sam Witteveen
