Overview
Build an audio preprocessing pipeline for AI applications in Python through this comprehensive tutorial video. Learn to batch preprocess audio files by applying Short-Time Fourier Transform, zero-padding, and min-max normalization in a single operation. Explore the Free Spoken Digit Dataset and gain insights into pipeline design and implementation. Follow along as the instructor guides you through creating various components, including Loader, Padder, LogSpectrogramExtractor, MinMaxNormaliser, and Saver classes. Discover how to integrate these elements into a cohesive preprocessing pipeline, and understand the process of pruning for optimization. By the end of this 59-minute session, acquire the skills to efficiently prepare audio datasets for machine learning applications.
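To make the three steps named above concrete, here is a minimal sketch of zero-padding, log-spectrogram extraction via the Short-Time Fourier Transform, and min-max normalization chained into one function. It is an illustration under stated assumptions, not the instructor's code: it uses only NumPy, whereas the video may well rely on an audio library, and the function names (`pad_right`, `log_spectrogram`, `min_max_normalise`, `preprocess`) as well as the frame size, hop length, and Hann window are hypothetical choices made for this example.

```python
import numpy as np


def pad_right(signal, num_expected_samples):
    """Zero-pad a 1-D signal on the right up to the expected length."""
    num_missing = max(num_expected_samples - len(signal), 0)
    return np.pad(signal, (0, num_missing), mode="constant")


def log_spectrogram(signal, frame_size=512, hop_length=256):
    """Log-magnitude STFT, computed frame by frame with a Hann window.

    frame_size and hop_length are assumed values, not taken from the course.
    """
    window = np.hanning(frame_size)
    num_frames = 1 + (len(signal) - frame_size) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + frame_size] * window
        for i in range(num_frames)
    ])
    # rfft keeps the non-negative frequencies; transpose to (freq_bins, frames).
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    # A small offset avoids log(0) on silent (zero-padded) frames.
    return 20 * np.log10(magnitude + 1e-10)


def min_max_normalise(array, new_min=0.0, new_max=1.0):
    """Rescale to [new_min, new_max]; also return the original min and max
    so the scaling can be inverted later. Assumes the array is not constant."""
    lo, hi = array.min(), array.max()
    scaled = (array - lo) / (hi - lo)
    return scaled * (new_max - new_min) + new_min, lo, hi


def preprocess(signal, num_expected_samples):
    """Apply padding, feature extraction, and normalization in sequence."""
    padded = pad_right(signal, num_expected_samples)
    features = log_spectrogram(padded)
    return min_max_normalise(features)


# Synthetic stand-in for a loaded audio file: a 440 Hz tone, 3000 samples.
tone = np.sin(2 * np.pi * 440 * np.arange(3000) / 8000)
normalised, original_min, original_max = preprocess(tone, num_expected_samples=4096)
print(normalised.shape)
```

Returning the original minimum and maximum alongside the normalized array is a design choice that allows the features to be mapped back to decibel values later; a batch version would loop this function over every file in a directory and save each result.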
Syllabus
Intro
The Free Spoken Digit Dataset
Pipeline intuition + design
Implementing Loader
Implementing Padder
Implementing LogSpectrogramExtractor
Implementing MinMaxNormaliser
Implementing Preprocessing Pipeline
Implementing Saver
Recap of implemented classes
Pruning the preprocessing pipeline
Outro
Taught by
Valerio Velardo - The Sound of AI