Multi-Modal LLMs for Image, Sound and Video - Episode 6.3

Donato Capitella via YouTube

- Vision Transformer

3 of 6

Classroom Contents

  1. MLLM Architecture
  2. Training MLLMs
  3. Vision Transformer
  4. Contrastive Learning: CLIP, SigLIP
  5. Lab: PaliGemma
  6. Summary
