Overview
Explore Meta's research in a 16-minute video explaining the Mixture-of-Transformers (MoT) paper, which introduces a novel approach to multi-modal AI tasks. The video traces the evolution of transformer models beyond text-only applications toward processing combinations of text, speech, images, and video. It presents the MoT architecture as a drop-in replacement for the traditional transformer, covering its motivation, the algorithm in detail, and its evaluation results. Clearly marked timestamps guide you through the architecture overview, empirical analysis, and real-world performance metrics of this approach to multi-modal foundation models.
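To make the architecture concrete, here is a minimal sketch of the core MoT idea as described above: each modality gets its own transformer weights (projections and feed-forward), while self-attention is computed globally over the full interleaved sequence. All names, shapes, and the two-modality setup are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # model dimension (illustrative)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# One set of QKV/FFN weights per modality (0 = text, 1 = image) -- hypothetical setup.
modality_weights = {
    m: {k: rng.standard_normal((d, d)) * 0.1 for k in ("Wq", "Wk", "Wv", "Wff")}
    for m in (0, 1)
}

def mot_layer(x, modality):
    """x: (seq, d) token states; modality: (seq,) int tag per token.
    Q/K/V use modality-specific weights, but attention spans the whole sequence."""
    q, k, v = np.empty_like(x), np.empty_like(x), np.empty_like(x)
    for m, w in modality_weights.items():
        idx = modality == m
        q[idx] = x[idx] @ w["Wq"]
        k[idx] = x[idx] @ w["Wk"]
        v[idx] = x[idx] @ w["Wv"]
    # Global self-attention: every token attends to every token, across modalities.
    h = softmax(q @ k.T / np.sqrt(d)) @ v
    out = np.empty_like(h)
    for m, w in modality_weights.items():
        idx = modality == m
        out[idx] = h[idx] @ w["Wff"]  # modality-specific feed-forward
    return out

tokens = rng.standard_normal((6, d))
tags = np.array([0, 0, 1, 1, 0, 1])  # interleaved text/image tokens
y = mot_layer(tokens, tags)
print(y.shape)  # (6, 8)
```

Routing by a fixed modality tag (rather than a learned gate, as in mixture-of-experts) is what makes MoT a drop-in replacement: the sequence layout and attention pattern stay identical to a dense transformer.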
Syllabus
- Intro
- Motivation
- Mixture-of-Transformers Architecture Overview
- MoT Algorithm
- Evaluation
- Empirical Analysis
- Outro
Taught by
AI Bites