Overview

Learn to fine-tune Vision Language Models (VLMs) on custom datasets through a 14-minute tutorial video that demonstrates practical implementations on free computational resources. Explore PyTorch notebooks and advanced JAX/Flax notebooks that parallelize work across 8 TPU cores for fine-tuning both large language models (LLMs) and VLMs with Keras 3. Access complete code examples and get recommendations for leveraging free compute across platforms such as Google Colab, Vertex AI, Model Garden, and Kaggle. Follow along with PaliGemma fine-tuning examples, explore PEFT fine-tuning techniques for Gemma models with full model parallelism, and dive into advanced JAX and Flax inference implementations that run on free Google Colab T4 GPUs. Gain hands-on experience with distributed training techniques while working with cutting-edge vision-language models.
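To give a concrete sense of the Gemma PEFT setup the video describes, here is a minimal sketch of LoRA fine-tuning with model parallelism across 8 TPU cores, assuming Keras 3 on the JAX backend and the keras_nlp Gemma preset. The preset name, mesh shape, LoRA rank, and toy training data are illustrative assumptions, not code from the notebooks.

```python
import os
os.environ["KERAS_BACKEND"] = "jax"  # must be set before importing keras

import keras
import keras_nlp

# Shard the model over all 8 TPU cores: 1-way data, 8-way model parallelism.
devices = keras.distribution.list_devices()
device_mesh = keras.distribution.DeviceMesh(
    shape=(1, 8), axis_names=("batch", "model"), devices=devices)
layout_map = keras_nlp.models.GemmaBackbone.get_layout_map(device_mesh)
# Note: Keras >= 3.6 expects ModelParallel(layout_map=..., batch_dim_name=...).
keras.distribution.set_distribution(
    keras.distribution.ModelParallel(device_mesh, layout_map,
                                     batch_dim_name="batch"))

# Requires Kaggle credentials to download the Gemma weights.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
gemma_lm.backbone.enable_lora(rank=4)  # PEFT: train only low-rank adapters

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=5e-5),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()])
# Toy stand-in for a real prompt/response dataset.
gemma_lm.fit(["Instruction: say hi.\nResponse: hi!"], epochs=1, batch_size=1)
```

Because the layout map shards every large weight matrix across the "model" mesh axis, models that would not fit on a single accelerator can still be fine-tuned on a free 8-core TPU runtime.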
Syllabus
CODE: Fine-Tune a Vision Language Model (VLM), e.g. PaliGemma-3B (sketch below)
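As a rough illustration of what such a fine-tuning notebook contains, below is a minimal PyTorch sketch of LoRA (PEFT) fine-tuning for PaliGemma-3B using Hugging Face transformers and peft. The checkpoint id, LoRA settings, and the dummy one-image training step are illustrative assumptions, not code from the video.

```python
import torch
from PIL import Image
from peft import LoraConfig, get_peft_model
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor

model_id = "google/paligemma-3b-pt-224"  # assumed checkpoint; gated on the Hub
processor = PaliGemmaProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16)

# PEFT: inject small low-rank adapters into the attention projections only.
lora_config = LoraConfig(
    r=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a tiny fraction of the 3B weights

# One dummy training step on a stand-in image; `suffix` becomes the labels.
image = Image.new("RGB", (224, 224), color="white")
batch = processor(text=["caption en"], images=[image],
                  suffix=["a blank white square"],
                  return_tensors="pt", padding="longest")
batch = batch.to(torch.bfloat16)  # cast float tensors to match the weights
loss = model(**batch).loss  # next-token loss over the suffix tokens
loss.backward()
```

In a real run, the dummy step would be replaced by an optimizer loop (or a Trainer) over a custom image-text dataset, which is the pattern the video applies on free Colab and Kaggle accelerators.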
Taught by
Discover AI