Overview
Watch a 10-minute technical video explaining the architecture of Meta's Emu model and its approach to improving image generation quality. Learn about the modified Latent Diffusion Model architecture behind Emu, which also powers Emu Edit and Emu Video and enables precise object manipulation in images and videos. Explore the quality fine-tuning procedure and the dataset curation process that contribute to Emu's image generation results. The video covers U-Net modifications, pre-training methodology, automatic and manual data curation techniques, evaluation metrics, and how quality-tuning principles can be applied to other models. Created by an experienced machine learning researcher, this breakdown includes links to papers, projects, and additional resources for further study.
Syllabus
- Intro
- AI Bites on Twitter: https://twitter.com/ai_bites
- Approach
- Architecture
- U-Net Modifications
- Pre-Training
- Fine-Tuning: Quality Tuning
- Fine-Tuning: Automatic Data Curation
- Fine-Tuning: Manual Data Curation
- Evaluation
- Applying Quality Tuning to Other Models
Taught by
AI Bites