Overview
This 38-minute video tutorial walks through implementing a Vector Quantized Generative Adversarial Network (VQGAN) in PyTorch. VQGAN is trained in two stages. The first stage works like an autoencoder: a fully convolutional encoder maps images into a low-dimensional latent space, each latent vector is quantized to its nearest entry in a learned codebook, and a fully convolutional decoder reconstructs the image. The second stage trains a transformer on the resulting sequences of codebook indices so that it can generate novel images. The tutorial explains the helper modules, encoder, decoder, codebook, discriminator, and LPIPS perceptual loss, then covers training and results for both stages along with the GPT and VQGAN Transformer implementations. Additional resources are provided for further reading on VAE, VQVAE, CNNs, non-local neural networks, PatchGAN, and hinge loss.
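To make the vector quantization step more concrete, here is a minimal sketch of a nearest-neighbour codebook lookup. It is not the tutorial's code: the function names `squared_distance` and `quantize`, the toy codebook, and the 2-dimensional latents are all assumptions chosen for illustration, and it uses plain Python rather than PyTorch so it runs without dependencies. A PyTorch implementation would typically do the same lookup with batched tensor operations.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Replace each latent vector with its nearest codebook entry.

    Returns the chosen codebook indices and the quantized vectors.
    """
    indices: List[int] = []
    quantized: List[Vector] = []
    for z in latents:
        # Pick the codebook entry with the smallest squared distance to z.
        idx = min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


# Toy data (hypothetical): 3 codebook entries and 4 encoder outputs, all 2-D.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
latents = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7), (0.7, 0.6)]

indices, quantized = quantize(latents, codebook)
print(indices)  # [0, 1, 2, 1]
```

The list of indices is the discrete representation of the image; in the two-stage setup described above, sequences like this are what the second-stage transformer is trained on.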
Syllabus
Introduction
Helper modules
Encoder
Decoder
Codebook
VQGAN
Discriminator
LPIPS
Utils
Training: First Stage
Results: First Stage
Introducing Second Stage
GPT
VQGAN Transformer
Training: Second Stage
Results: Second Stage
GitHub Code & Outro
Taught by
Outlier