Fine-tuning Florence-2: Microsoft's Multimodal Model for Custom Object Detection

Overview

Learn how to fine-tune Microsoft's Florence-2, an open-source Vision Language Model (VLM), for custom object detection in this 26-minute tutorial. The walkthrough runs in Google Colab and covers setting up the environment, exploring the model's pre-trained capabilities, preparing a dataset with PyTorch data loading, and training efficiently with LoRA. It then shows how to evaluate the fine-tuned model and how Florence-2 compares with other computer vision models. Supporting resources include GitHub notebooks, blog posts, and a Hugging Face Space for hands-on practice. A follow-up community session is also scheduled for questions and further discussion.

Syllabus

- Introduction: Unlock the Power of Florence-2
- Getting Started: Prepare for VLM Fine-Tuning
- Florence-2 in Action: Explore Pre-trained Capabilities
- Dataset Deep Dive: PyTorch Data Loading for Florence-2
- LoRA: Optimize Your VLM Training
- Fine-Tuning: Unleash Florence-2's Custom Object Detection
- Model Evaluation: Measure Your VLM's Success
- Florence-2 vs Other Computer Vision Models
- Conclusion and Next Steps
- Community Session July 3rd, 2024 at 08:00 AM PST / 11:00 AM EST / PM CET: https://roboflow.stream