Stable Diffusion and Friends - High-Resolution Image Synthesis via Two-Stage Generative Models

Overview

Follow the evolution of generative image models in this talk by Robin Rombach, co-creator of Stable Diffusion. The talk traces the progression from GANs to Transformers and latent diffusion models, with a focus on high-resolution image synthesis. Topics include two-stage generative models, the VQ-VAE architecture, Vision Transformers, and the Stable Diffusion model. Applications covered include text-to-image generation, semantic synthesis, upscaling, and creative uses such as text-to-color-palette conversion and video stylization. Rombach draws on his research experience and his role in developing widely used projects such as VQGAN, Taming Transformers, and Latent Diffusion Models.

Syllabus

Introduction
Diffusion
Two-Stage Generative Models
Leon Model
Why domain knowledge
VQ-VAE architecture
VQ-VAE reconstruction
Vision Transformers
VQGAN
High-Resolution Image Synthesis
Text to Image Generation
Stable Diffusion
Classifier-Free Diffusion Guidance
Stable Diffusion Inpainting
Semantic Synthesis
Upscaling
SDEdit
Diffusion Model
Creative Applications
Text to Color Palette
Video stylization
Lexi Carlile
Credits
Questions
One Direction
Adding Numerology
Conclusion

Taught by

Hugging Face
