Neural Nets for NLP - Latent Random Variables
- 1 Intro
- 2 Discriminative vs. Generative Models • Discriminative model: calculate the probability of the output given the input (the distinction is written out after this list)
- 3 Quiz: What Types of Variables? • In an attentional sequence-to-sequence model trained with MLE/teacher forcing, are the following variables observed or latent? Deterministic or random?
- 4 Why Latent Random Variables?
- 5 What Is a Latent Random Variable Model?
- 6 A Latent Variable Model
- 7 An Example (Doersch 2016)
- 8 Variational Inference (the ELBO is written out after this list)
- 9 Practice
- 10 Variational Autoencoders
- 11 VAE vs. AE
- 12 Problem! Sampling Breaks Backprop
- 13 Solution: Re-parameterization Trick (a sketch appears after this list)
- 14 Motivation for Latent Variables • Allows for a consistent latent space of sentences?
- 15 Difficulties in Training
- 16 KL Divergence Annealing • Basic idea: multiply the KL term by a constant starting at zero, then gradually increase it to 1 • Result: the model can learn to use z before getting penalized (a schedule sketch appears after this list)
- 17 Solution 2: Weaken the Decoder • But theoretically still problematic: it can be shown that the optimal strategy is to ignore z when it is not necessary (Chen et al. 2017)
- 18 Aggressive Inference Network Learning
- 19 Discrete Latent Variables?
- 20 Enumeration
- 21 Method 2: Sampling • Randomly sample a subset of configurations of z and optimize with respect to this subset (a score-function sketch appears after this list)
- 22 Method 3: Reparameterization (Maddison et al. 2017, Jang et al. 2017) • A Gumbel-softmax sketch appears after this list
- 23 Variational Models of Language Processing (Miao et al. 2016) • Present models with random variables for document modeling and question-answer pair selection
- 24 Controllable Text Generation (Hu et al. 2017)
- 25 Symbol Sequence Latent Variables (Miao and Blunsom 2016) • Encoder-decoder with a sequence of latent symbols
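
A one-line statement of the distinction in item 2, assuming the usual notation (X = input, Y = output):

```latex
\underbrace{P(Y \mid X)}_{\text{discriminative}}
\qquad \text{vs.} \qquad
\underbrace{P(X, Y)}_{\text{generative}} = P(Y)\,P(X \mid Y)
```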
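For item 8, the objective behind variational inference is the evidence lower bound (ELBO); in the standard notation, with q_phi the inference network and p_theta the generative model:

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
- \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right)
```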
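Items 12-13 in code: sampling z directly from N(mu, sigma^2) blocks gradients, but rewriting the sample as a deterministic function of mu, sigma, and parameter-free noise restores them. A minimal PyTorch sketch (the function name is mine):

```python
import torch

def reparameterize(mu, log_var):
    # z = mu + sigma * eps, with eps ~ N(0, I) drawn outside the graph,
    # so gradients flow through mu and log_var but not through eps.
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + eps * std
```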
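Item 16 in code: scale the KL term by a weight that anneals from 0 to 1, so the model learns to use z before the prior penalty kicks in. A sketch with a hypothetical linear schedule (`warmup_steps` is an assumed hyperparameter):

```python
def kl_weight(step, warmup_steps=10_000):
    # Linear KL annealing: weight is 0 at step 0, reaches 1 at warmup_steps.
    return min(1.0, step / warmup_steps)

# Inside a training loop (recon_loss and kl_term assumed computed elsewhere):
# loss = recon_loss + kl_weight(step) * kl_term
```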
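Item 21's sampling approach still needs gradients with respect to the distribution over discrete z; the usual pairing, which this sketch assumes, is the score-function (REINFORCE) estimator (`loss_fn` is a hypothetical downstream loss taking a sampled z):

```python
import torch

def sampled_surrogate(logits, loss_fn, n_samples=4):
    # Approximate E_q[loss(z)] over a sampled subset of configurations of z.
    dist = torch.distributions.Categorical(logits=logits)
    surrogates = []
    for _ in range(n_samples):
        z = dist.sample()      # discrete sample: no gradient path by itself
        loss = loss_fn(z)
        # Score-function trick: grad E[loss] = E[loss * grad log q(z)],
        # implemented by treating the loss as a constant reward.
        surrogates.append(loss.detach() * dist.log_prob(z) + loss)
    return torch.stack(surrogates).mean()
```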
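Item 22 refers to the Concrete / Gumbel-Softmax relaxation (Maddison et al. 2017; Jang et al. 2017), which PyTorch exposes directly; a minimal sketch (the batch and category sizes are assumptions):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 10, requires_grad=True)  # batch of 8, 10 categories

# Relaxed (continuous) sample: differentiable, approaches one-hot as tau -> 0.
z_soft = F.gumbel_softmax(logits, tau=0.5)

# Straight-through variant: one-hot forward pass, soft gradients backward.
z_hard = F.gumbel_softmax(logits, tau=0.5, hard=True)
```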