Preprocessing Audio Datasets for Machine Learning

Overview

Build an audio preprocessing pipeline for AI applications in Python with this tutorial video. Learn to batch preprocess audio files by applying the Short-Time Fourier Transform, zero-padding, and min-max normalization in a single pass over the dataset. Explore the Free Spoken Digit Dataset and see how the pipeline is designed and implemented. Follow along as the instructor builds each component: the Loader, Padder, LogSpectrogramExtractor, MinMaxNormaliser, and Saver classes. See how these components are combined into one preprocessing pipeline, and how the pipeline is then pruned. By the end of this 59-minute session, you will be able to prepare audio datasets for machine learning applications.
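The three operations named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the instructor's code: the function names, the frame size of 512, the hop length of 256, the Hann window, and the synthetic 440 Hz tone standing in for a loaded recording are all assumptions made for the example.

```python
import numpy as np


def right_pad(signal, target_len):
    """Zero-pad a 1-D signal on the right up to target_len samples."""
    missing = max(target_len - len(signal), 0)
    return np.pad(signal, (0, missing), mode="constant")


def log_spectrogram(signal, frame_size=512, hop_length=256):
    """Log-magnitude STFT with shape (1 + frame_size // 2, n_frames)."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop_length
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_size] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    # Convert to decibels; the small epsilon avoids log(0) on silent frames.
    return 20 * np.log10(magnitude + 1e-10)


def min_max_normalise(array, new_min=0.0, new_max=1.0):
    """Rescale array linearly into [new_min, new_max]."""
    span = array.max() - array.min()
    if span == 0:
        return np.full_like(array, new_min, dtype=float)
    scaled = (array - array.min()) / span
    return scaled * (new_max - new_min) + new_min


# Assumed stand-in for a loaded recording: 0.5 s of a 440 Hz tone at 22,050 Hz.
sample_rate = 22050
t = np.arange(int(0.5 * sample_rate)) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

padded = right_pad(signal, target_len=sample_rate)  # pad to exactly 1 second
features = min_max_normalise(log_spectrogram(padded))
```

Padding every clip to the same length before the transform gives every spectrogram the same shape, which is what allows the files to be processed and stacked as a batch.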

Syllabus

Intro
The Free Spoken Digit Dataset
Pipeline intuition + design
Implementing Loader
Implementing Padder
Implementing LogSpectrogramExtractor
Implementing MinMaxNormaliser
Implementing Preprocessing Pipeline
Implementing Saver
Recap of implemented classes
Pruning the preprocessing pipeline
Outro
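
The syllabus builds the components one at a time and then combines them. One possible way to wire such stages together is sketched below. It is an assumption about the general shape only: the field names, the `process` method, the in-memory "files", and the toy stand-in stages (absolute value in place of a log spectrogram) are all hypothetical and are not taken from the video.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class PreprocessingPipeline:
    """Runs load -> pad -> extract -> normalise -> save for each file."""
    load: Callable[[str], List[float]]
    pad: Callable[[List[float]], List[float]]
    extract: Callable[[List[float]], List[float]]
    normalise: Callable[[List[float]], List[float]]
    save: Callable[[str, List[float]], None]

    def process(self, paths: Iterable[str]) -> None:
        for path in paths:
            signal = self.load(path)
            signal = self.pad(signal)
            features = self.extract(signal)
            features = self.normalise(features)
            self.save(path, features)


# Hypothetical in-memory dataset so the sketch runs without audio files.
fake_files = {"a.wav": [0.5, -1.0], "b.wav": [0.25, 0.5, -0.5]}
saved = {}


def pad_to_four(signal):
    """Right-pad with zeros up to 4 samples."""
    return signal + [0.0] * (4 - len(signal))


def toy_extract(signal):
    """Stand-in for a real feature extractor such as a log spectrogram."""
    return [abs(x) for x in signal]


def min_max(values):
    """Rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


pipeline = PreprocessingPipeline(
    load=fake_files.__getitem__,
    pad=pad_to_four,
    extract=toy_extract,
    normalise=min_max,
    save=saved.__setitem__,
)
pipeline.process(fake_files)
```

Keeping each stage behind a small interface means any one of them can be swapped, for example a different feature extractor, without touching the loop that drives the batch.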

Taught by

Valerio Velardo - The Sound of AI
