Training and Serving Custom Multi-modal Models - IDEFICS 2 and LLaVA Llama 3

Trelis Research via YouTube

Evaluating multiple image inputs

6 of 11

Class Central Classrooms beta

YouTube videos curated by Class Central.

Classroom Contents

  1. Fine-tuning and server setup for multi-modal models
  2. Prerequisites pre-watching
  3. IDEFICS 2 Model Overview
  4. Model loading, evaluation and LoRA setup
  5. Evaluating OCR performance
  6. Evaluating multiple image inputs
  7. Training / Fine-tuning
  8. LLaVA Llama 3 Model Review
  9. Multi-modal inference endpoint
  10. VRAM Requirements for multi-modal models
  11. IDEFICS 2 - my recommended model to build on
