Overview
Explore a 23-minute technical video analysis of Movie Gen, Meta's video generation model, and the research paper behind it. The video walks through the model's architecture, training methodology, and stated goals for AI-powered content creation. Learn about key components including the temporal auto-encoder architecture, the transformer backbone, the training objective with its outlier penalty loss, and the complete pipeline from pre-training through fine-tuning to inference. Further topics include tiled inference, the super-resolution model, parallelism in training, and the multi-stage training approach. The presenter is a machine learning researcher with 15 years of software engineering experience and expertise in computer vision and robotics.
Syllabus
- Intro
- Paper Intro
- Training recipe overview
- Image and Video generation pipeline
- Temporal Auto-Encoder architecture
- Transformer backbone architecture
- Training Objective: Outlier Penalty Loss
- Tiled inference
- Superresolution model
- Training setting
- Parallelism for training
- Pre-training data
- Multi-stage training
- Fine-tuning
- Inference
- Outro
Taught by
AI Bites