

Mixture of Transformers for Multi-modal Foundation Models

AI Bites via YouTube

Overview

Explore Meta's research in a 16-minute video explaining the Mixture-of-Transformers (MoT) paper, which introduces a novel approach to multi-modal AI. Dive into the evolution of transformer models beyond text-only applications toward jointly processing text, speech, images, and video. Learn how the MoT architecture serves as a drop-in replacement for the standard transformer, covering its motivation, the algorithm in detail, and its evaluation results. Follow the video's timestamps as it walks through the architecture overview, empirical analysis, and real-world performance of this approach to multi-modal foundation models.
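To make the "drop-in replacement" idea concrete: the MoT paper decouples a transformer block's non-embedding parameters (feed-forward networks, attention projections, layer norms) by modality, while keeping self-attention global over the full mixed-modality sequence. The sketch below is a simplified, illustrative single block, not the paper's implementation; the weight shapes, ReLU activation, and two-modality setup are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mot_block(x, modality_ids, experts):
    """One simplified MoT block. experts[m] holds modality m's own
    weights (attention projections and FFN); the attention softmax
    itself runs globally over the mixed-modality sequence."""
    seq, d = x.shape
    q = np.empty_like(x); k = np.empty_like(x); v = np.empty_like(x)
    # Per-modality Q/K/V projections: each token is projected with the
    # weights belonging to its own modality.
    for m, p in enumerate(experts):
        sel = modality_ids == m
        q[sel] = x[sel] @ p["Wq"]
        k[sel] = x[sel] @ p["Wk"]
        v[sel] = x[sel] @ p["Wv"]
    # Global self-attention: every token attends to all tokens,
    # regardless of modality.
    h = x + softmax(q @ k.T / np.sqrt(d)) @ v
    # Modality-specific feed-forward: route tokens to their own FFN.
    out = np.empty_like(h)
    for m, p in enumerate(experts):
        sel = modality_ids == m
        out[sel] = np.maximum(h[sel] @ p["W1"], 0.0) @ p["W2"]
    return h + out

# Toy usage: 6 tokens, first 3 "text" (modality 0), last 3 "image" (1).
d = 8
experts = [
    {"Wq": rng.standard_normal((d, d)) * 0.1,
     "Wk": rng.standard_normal((d, d)) * 0.1,
     "Wv": rng.standard_normal((d, d)) * 0.1,
     "W1": rng.standard_normal((d, 4 * d)) * 0.1,
     "W2": rng.standard_normal((4 * d, d)) * 0.1}
    for _ in range(2)
]
x = rng.standard_normal((6, d))
ids = np.array([0, 0, 0, 1, 1, 1])
y = mot_block(x, ids, experts)
print(y.shape)  # (6, 8)
```

Because each token only touches its own modality's weights in the projections and FFN, the block has sparse, modality-routed parameters yet remains interchangeable with a dense transformer block of the same input/output shape.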

Syllabus

- Intro
- Motivation
- Mixture-of-Transformers Architecture Overview
- MoT Algorithm
- Evaluation
- Empirical Analysis
- Outro

Taught by

AI Bites

