Understanding Text-to-Video Diffusion Models - From Core Concepts to Latest Developments
Neural Breakdown with AVB via YouTube
Overview
Explore a comprehensive 13-minute video lecture on video diffusion generative AI models. Learn about the fundamental challenges and solutions in text-to-video generation while examining influential papers from major labs, including Google's Imagen Video, Meta's Make-A-Video, Nvidia's Video Latent Diffusion Model (Video LDM), and OpenAI's Sora. Review the essential concepts behind image diffusion models (forward and reverse diffusion, the UNet architecture, convolution, and diffusion transformers) and trace the evolution of video generation through key developments such as VDM (2022) and factorized 3D UNet models. Supplementary materials include related videos on conditional image diffusion models, latent space, and LLM image generation, along with a collection of research papers and technical resources for deeper study.
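As a primer for the forward-diffusion concept the lecture covers, here is a minimal NumPy sketch of the DDPM-style noising process: a clean sample is progressively mixed with Gaussian noise according to a variance schedule. The linear beta schedule and all parameter values below are illustrative assumptions, not taken from the video.

```python
import numpy as np

# Illustrative DDPM-style linear beta schedule (values are assumptions).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative signal-retention factor

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0): x0 corrupted to noise level t."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))   # stand-in for an image (or one video frame)
x_early = q_sample(x0, 10, rng)    # mostly signal
x_late = q_sample(x0, T - 1, rng)  # nearly pure Gaussian noise
```

Reverse diffusion, which the lecture also introduces, is the learned inverse of this process: a network (typically a UNet or diffusion transformer) is trained to predict the added noise so samples can be denoised step by step.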
Syllabus
- Intro
- Text to Image Conditional Diffusion Models
- Challenges with Video Diffusion Models
- VDM 2022
- Factorized 3D UNet models
- Meta Make-A-Video
- Google Imagen Video
- Nvidia Video LDM
- OpenAI Sora
Taught by
Neural Breakdown with AVB