Overview
Explore Jukebox, a groundbreaking generative model for music that creates entire songs with remarkable quality and consistency, and that can be conditioned on genre, artist, and even lyrics. Examine how the model tackles the long context of raw audio by compressing it into discrete codes with a multiscale VQ-VAE and then modeling those codes with autoregressive Transformers. Discover how the combined model, trained at scale, generates high-fidelity and diverse songs with coherence up to multiple minutes. Learn how artist and genre conditioning steer the musical and vocal style, and how unaligned lyrics make the singing controllable. Gain insights into the model's architecture, training process, and potential applications in AI-generated music.
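To make the compression step concrete, here is a minimal, illustrative sketch of the vector-quantization lookup at the heart of a VQ-VAE: continuous encoder outputs are snapped to their nearest codebook entries, producing the discrete codes that the autoregressive priors then model. The function name, shapes, and NumPy implementation are assumptions for illustration, not code from the Jukebox release.

```python
# Illustrative VQ-VAE quantization step (not the Jukebox implementation):
# each continuous latent vector is replaced by its nearest codebook entry,
# and the entry's index becomes the discrete code fed to the Transformer prior.
import numpy as np

def quantize(latents: np.ndarray, codebook: np.ndarray):
    """Map each latent vector to the index of its nearest codebook entry.

    latents:  (T, D) continuous encoder outputs for T timesteps
    codebook: (K, D) learned embedding vectors
    returns:  (codes, quantized) where codes is (T,) integer indices and
              quantized is (T, D) the corresponding codebook vectors
    """
    # Squared Euclidean distance between every latent and every codebook entry
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = dists.argmin(axis=1)      # discrete code per timestep
    quantized = codebook[codes]       # lookup of the chosen codebook vectors
    return codes, quantized

# Toy usage: 8 timesteps of 4-dimensional latents, a codebook of 16 entries.
rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 4))
codebook = rng.normal(size=(16, 4))
codes, quantized = quantize(latents, codebook)
print(codes)  # 8 integers in [0, 16): the tokens an autoregressive prior would model
```

Jukebox applies this kind of quantization at three temporal resolutions, so the coarsest codes summarize long stretches of audio compactly enough for a Transformer to stay coherent over minutes of music.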
Syllabus
Jukebox: A Generative Model for Music (Paper Explained)
Taught by
Yannic Kilcher