Overview
Explore a technical video lecture that breaks down OpenAI's Sora model and the Diffusion Transformer technology behind it. Learn the fundamental concepts behind diffusion models, the U-Net architecture, autoencoders, and latent diffusion models before diving into the Diffusion Transformer architecture and its variations. Cover key technical topics, including how patch size scales against model size and the Fréchet Inception Distance (FID) metric, through practical examples. The explanations are supported by academic papers, with links to additional resources including the Generative Deep Learning book and relevant research publications. Discord and community channels are provided, along with supplementary materials such as the Road to Sora reading list for continued learning.
Syllabus
Road to Sora
Intro to Diffusion Transformer
What is a Diffusion Model?
What is a U-Net?
Auto Encoder
Latent Diffusion Models
Diffusion Transformer Architecture
Variations on the Diffusion Transformer
Scaling Patch vs. Model Size
FID
Examples
Taught by
Oxen