Stanford Seminar - Audio Research: Transformers for Applications in Audio, Speech and Music
Stanford University via YouTube
Syllabus
Introduction.
Transformers for Music and Audio: Language Modelling to Understanding to Synthesis.
The Transformer Revolution.
Models getting bigger ....
What are spectrograms.
Raw Audio Synthesis: Difficulty; Classical FM Synthesis; Karplus-Strong.
Baseline: Classic WaveNet.
Improving Transformer Baseline: Major bottleneck of Transformers.
Results & Unconditioned Setup: Evaluation criterion: comparing WaveNet and Transformers on next-sample prediction, using top-5 accuracy out of 256 possible states as the error metric. Why this setup? (1) Application agnostic; (2) suits the training setup.
A Framework for Generative and Contrastive Learning of Audio Representations.
Acoustic Scene Understanding.
Recipe of doing.
Turbocharging best of two worlds: Vector Quantization, a powerful and under-utilized algorithm; combining VQ with auto-encoders and Transformers.
Turbocharging best of two worlds: Learning clusters from vector quantization; use long-term dependency learning with that cluster-based representation for the Markovian assumption; the better we become at prediction, the better the summarization is.
Audio Transformers: Transformer Architectures for Large Scale Audio Understanding - Adieu Convolutions (Stanford University, March 2021).
Wavelets on Transformer Embeddings.
Methodology + Results.
What does it learn -- the front end.
Final Thoughts.
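The evaluation criterion named in the syllabus (next-sample prediction scored by top-5 accuracy over 256 possible states) can be illustrated with a minimal Python sketch. This is not the seminar's code: the function name `top_k_accuracy`, the helper `peaked_scores`, and the toy scores are all hypothetical, and the sketch assumes each prediction is a list of 256 scores, one per quantized amplitude state.

```python
from typing import Sequence

NUM_STATES = 256  # assumed: one class per 8-bit quantized audio sample value


def top_k_accuracy(
    scores: Sequence[Sequence[float]], targets: Sequence[int], k: int = 5
) -> float:
    """Fraction of steps whose true class is among the k highest-scoring classes."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not targets:
        raise ValueError("need at least one prediction")
    hits = 0
    for row, target in zip(scores, targets):
        # Indices of the k largest scores in this row (ties go to the lower index).
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)


def peaked_scores(center: int) -> list[float]:
    """Hypothetical toy model output: scores fall off with distance from `center`."""
    return [-abs(i - center) for i in range(NUM_STATES)]


# Three toy prediction steps: the true sample is 2 states off, 3 states off, and exact.
scores = [peaked_scores(100), peaked_scores(100), peaked_scores(7)]
targets = [102, 103, 7]
accuracy = top_k_accuracy(scores, targets, k=5)
print(f"top-5 accuracy: {accuracy:.3f}")
```

In this toy case a prediction that is off by two states still counts as a hit, and one that is off by three does not. That tolerance for near misses is one reason a top-5 criterion can be more forgiving than exact-match accuracy on a 256-way output.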
Taught by
Stanford Online