

Lumiere: Space-Time Diffusion Model for Video Generation

Yannic Kilcher via YouTube

Overview

Explore a detailed explanation of Google Research's Lumiere, a text-to-video diffusion model designed to synthesize videos with realistic and coherent motion. Dive into the Space-Time U-Net architecture, which generates the entire temporal duration of a video in a single pass, overcoming the limitations of existing approaches that first generate distant keyframes and then temporally upsample between them. Learn about the model's ability to process videos at multiple space-time scales, its state-of-the-art results in text-to-video generation, and its versatility across content creation tasks. Examine the technical details, including temporal down- and up-sampling and the use of a pre-trained text-to-image model, along with applications such as image-to-video generation, video inpainting, and stylized generation. Gain insights into the training, evaluation, and potential societal impact of this technology.
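
To make the single-pass idea concrete, here is a minimal PyTorch sketch of a block that downsamples and then upsamples a clip in both space and time, in the spirit of the Space-Time U-Net. This is not Lumiere's actual code: the layer types, channel counts, strides, and shapes are illustrative assumptions (the real model inflates a pre-trained text-to-image U-Net with temporal layers).

```python
# Illustrative sketch only: processes a whole clip at a coarser
# space-time scale, then restores the original resolution.
import torch
import torch.nn as nn

class SpaceTimeDownUp(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # Strided 3D conv halves the temporal (T) and spatial (H, W) axes.
        self.down = nn.Conv3d(channels, channels * 2,
                              kernel_size=3, stride=(2, 2, 2), padding=1)
        self.mid = nn.Conv3d(channels * 2, channels * 2,
                             kernel_size=3, padding=1)
        # Transposed 3D conv restores the original space-time resolution.
        self.up = nn.ConvTranspose3d(channels * 2, channels,
                                     kernel_size=4, stride=(2, 2, 2),
                                     padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, T, H, W) -- the entire clip in one pass,
        # rather than keyframes that are temporally upsampled afterwards.
        h = torch.relu(self.down(x))
        h = torch.relu(self.mid(h))
        return self.up(h)

if __name__ == "__main__":
    clip = torch.randn(1, 64, 16, 32, 32)  # toy 16-frame clip
    out = SpaceTimeDownUp()(clip)
    print(out.shape)  # torch.Size([1, 64, 16, 32, 32])
```

The key point the sketch captures is that downsampling happens along the temporal axis as well as the spatial ones, so the network reasons over the full clip at a compact space-time representation instead of over isolated keyframes.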

Syllabus

- Introduction
- Problems with keyframes
- Space-Time U-Net (STUNet)
- Extending U-Nets to video
- Multidiffusion for fusing SSR (spatial super-resolution) predictions (see the sketch after this syllabus)
- Stylized generation by swapping weights
- Training & Evaluation
- Societal Impact & Conclusion
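
As referenced in the syllabus item on multidiffusion, here is a rough sketch of the general technique of fusing predictions from overlapping temporal windows by per-frame averaging, the core idea behind MultiDiffusion-style fusion for the SSR stage. The function and its shapes are assumptions for illustration, not the paper's code.

```python
# Illustrative sketch only: each frame's fused result is the average of
# every window prediction that covers it.
import torch

def fuse_overlapping_windows(window_preds, window_starts, num_frames):
    """window_preds: list of (win_len, C, H, W) tensors;
    window_starts: start frame index of each window."""
    c, h, w = window_preds[0].shape[1:]
    accum = torch.zeros(num_frames, c, h, w)
    counts = torch.zeros(num_frames, 1, 1, 1)
    for pred, start in zip(window_preds, window_starts):
        end = start + pred.shape[0]
        accum[start:end] += pred   # sum predictions covering each frame
        counts[start:end] += 1     # how many windows cover each frame
    return accum / counts          # per-frame average

# Toy usage: three 8-frame windows with stride 4 covering 16 frames.
preds = [torch.randn(8, 3, 16, 16) for _ in range(3)]
fused = fuse_overlapping_windows(preds, [0, 4, 8], 16)
print(fused.shape)  # torch.Size([16, 3, 16, 16])
```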

Taught by

Yannic Kilcher

