Understanding Text Generation Through Diffusion Models - From Theory to Implementation
Overview
Explore a 42-minute technical video that dives deep into Discrete Diffusion Modeling by examining a research paper presenting a text generation technique that rivals GPT-2. Learn how probability distributions are modeled in generative AI, starting with fundamental challenges such as the absence of a black-box solution and the intractable normalizing constant. Progress through candidate solutions, from training a network to approximate the probability mass function to autoregressive modeling, before focusing on the core innovation: modeling the score rather than the probability mass. Understand how the concrete score is learned through diffusion, examine the evaluation methods, and grasp the practical implications and key takeaways of this text generation approach. Perfect for AI researchers and practitioners interested in alternative approaches to language modeling and text generation.
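To make the overview's terminology concrete, here is a minimal sketch of the setup it alludes to (the symbols f_theta, Z_theta, and the ratio form below are illustrative assumptions for this summary, not notation taken verbatim from the paper):

\[
p_\theta(x) = \frac{e^{f_\theta(x)}}{Z_\theta},
\qquad
Z_\theta = \sum_{x'} e^{f_\theta(x')},
\]

where the sum runs over every possible token sequence, which is what makes Z_theta intractable. Autoregressive models sidestep Z_theta by factorizing

\[
p(x) = \prod_t p(x_t \mid x_{<t}),
\]

while the score-based route models ratios between nearby sequences,

\[
\frac{p_\theta(y)}{p_\theta(x)} = e^{f_\theta(y) - f_\theta(x)},
\]

in which Z_theta cancels entirely.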
Syllabus
Intro
Modeling Probability Distributions for Generative AI
Problem #1: No Black Box
Solution #1: Train a Network to Approximate the Probability Mass Function
Problem #2: The Normalizing Constant, Z_theta, Is Intractable
Solution #2: Autoregressive Modeling
Solution #3 (The Real Solution): Model Score, Not Probability Mass (see the code sketch after this syllabus)
Learning the Concrete Score Through Diffusion
Evaluation
So What?
Takeaways
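For readers who want something runnable, the toy Python sketch below (written for this summary and not taken from the video or the paper; the lookup table f_theta, the vocabulary size, and the sequence length are all hypothetical) illustrates why Z_theta is only computable when the sequence space is tiny, and why the concrete score, the ratio between probabilities of nearby sequences, never needs Z_theta at all:

```python
# Toy illustration (not the paper's implementation): why the normalizing
# constant Z_theta is intractable for sequences, and why modeling the
# "concrete score" -- ratios p(y)/p(x) between nearby sequences -- sidesteps it.
import itertools
import math
import random

VOCAB = list(range(5))   # hypothetical tiny vocabulary of 5 tokens
LENGTH = 3               # hypothetical sequence length, so all 5**3 sequences fit in memory

random.seed(0)
# Stand-in for a learned unnormalized log-probability f_theta(x): a random lookup table.
f_theta = {x: random.gauss(0.0, 1.0) for x in itertools.product(VOCAB, repeat=LENGTH)}

# Normalizing constant: a sum over |VOCAB|**LENGTH sequences. Feasible here only
# because the toy space has 125 sequences; with a realistic tokenizer (~50k tokens)
# and realistic lengths, the sum has astronomically many terms.
Z_theta = sum(math.exp(f) for f in f_theta.values())

def prob(x):
    """Normalized probability mass -- needs the (generally intractable) Z_theta."""
    return math.exp(f_theta[x]) / Z_theta

def concrete_score(x, y):
    """Ratio p(y)/p(x) for sequences differing in one position -- Z_theta cancels."""
    return math.exp(f_theta[y] - f_theta[x])

x = (0, 1, 2)
y = (0, 4, 2)  # the same sequence with position 1 swapped
print(prob(y) / prob(x))       # ratio computed the expensive way, via Z_theta
print(concrete_score(x, y))    # identical value, no Z_theta required
```

In the video's setting the lookup table is replaced by a neural network and the ratios are learned through a diffusion process, but the cancellation shown above is the key reason modeling the score avoids the intractable normalizing constant.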
Taught by
Oxen