Side-note - IDEFICS 2 vision to text adapter architecture
Classroom Contents
Fine-tuning Multi-modal Video and Text Models
- 1 "Video + Text" from "Image + Text" models
- 2 Clipping and Querying Videos with an IDEFICS 2 endpoint
- 3 Fine-tuning video + text models
- 4 Dataset generation for video fine-tuning + pushing to hub
- 5 Clipping and querying videos with image splitting in a Jupyter Notebook
- 6 Side-note - IDEFICS 2 vision to text adapter architecture
- 7 Video clip notebook evaluation - continued
- 8 Loading a video dataset for fine-tuning
- 9 Recap of video + text model fine-tuning