Fine-tuning Multi-modal Video and Text Models

Trelis Research via YouTube

Side-note - IDEFICS 2 vision to text adapter architecture

6 of 9
