
YouTube

Understanding Text-to-Video Diffusion Models - From Core Concepts to Latest Developments

Neural Breakdown with AVB via YouTube

Overview

Explore a comprehensive 13-minute video lecture that delves into the cutting-edge world of video diffusion generative AI models. Learn about the fundamental challenges and solutions in text-to-video generation while examining influential papers from major tech companies, including Google's Imagen Video, Meta's Make-A-Video, Nvidia's Video Latent Diffusion Model, and OpenAI's SORA. Master essential concepts of image diffusion models, including forward and reverse diffusion, the UNet architecture, convolution, and diffusion transformers, while gaining insight into the evolution of video generation technology through key developments such as VDM (2022), factorized 3D UNet models, and various industry implementations. Access supplementary materials, including related videos on conditional image diffusion models, latent space, and LLM image generation, along with an extensive collection of research papers and technical resources for deeper understanding.
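The forward diffusion process mentioned above has a well-known closed form: a clean sample can be noised to any timestep in one step using the cumulative product of the noise schedule. The sketch below is a minimal NumPy illustration of that idea, not code from the course; the function name, the linear beta schedule, and the toy 4×4 "image" are all assumptions chosen for demonstration.

```python
import numpy as np

def forward_diffusion(x0, t, betas, rng=np.random.default_rng(0)):
    """Sample x_t from q(x_t | x_0) in closed form.

    x0:    clean data (any NumPy array)
    t:     integer timestep index
    betas: noise schedule, shape (T,)
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]        # cumulative product up to step t
    noise = rng.standard_normal(x0.shape)    # epsilon ~ N(0, I)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    return xt, noise

# Toy example: a linear beta schedule over 1000 steps (a common choice)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
x0 = np.ones((4, 4))                         # stand-in for an image tensor
xt, eps = forward_diffusion(x0, T - 1, betas)
# By the final step, alpha_bar is near zero, so x_T is almost pure noise
```

Reverse diffusion, which the lecture also covers, is the learned inverse of this process: a network is trained to predict the added noise `eps` so that `x0` can be recovered step by step.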

Syllabus

- Intro
- Text to Image Conditional Diffusion Models
- Challenges with Video Diffusion Models
- VDM 2022
- Factorized 3D Unet models
- Meta Make A Video
- Google Imagen Video
- Nvidia Video LDM
- OpenAI SORA

Taught by

Neural Breakdown with AVB

