Preprocessing Audio Datasets for Machine Learning

Valerio Velardo - The Sound of AI via YouTube

Overview

Build an audio preprocessing pipeline for AI applications in Python in this 59-minute tutorial video. Learn to batch preprocess audio files by applying the Short-Time Fourier Transform, zero-padding, and min-max normalization in a single pass. The tutorial introduces the Free Spoken Digit Dataset, explains the design of the pipeline, and then implements each component in turn: the Loader, Padder, LogSpectrogramExtractor, MinMaxNormaliser, and Saver classes. It then integrates these components into one preprocessing pipeline and closes by pruning the pipeline for optimization. By the end, you will be able to prepare audio datasets efficiently for machine learning applications.
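As a rough illustration of two of the steps named above, zero-padding and min-max normalization, here is a minimal sketch. It is not the instructor's code: the function names `right_pad` and `min_max_normalise` are hypothetical, and plain Python lists are used so the example has no dependencies (a real pipeline would typically operate on NumPy arrays). The Short-Time Fourier Transform step is left out because it normally relies on an audio library.

```python
def right_pad(signal, target_length, pad_value=0.0):
    """Append pad_value to the end until the signal has target_length samples.

    Signals that are already long enough are returned unchanged (as a copy).
    """
    missing = target_length - len(signal)
    if missing <= 0:
        return list(signal)
    return list(signal) + [pad_value] * missing


def min_max_normalise(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant signal has no range to stretch; map everything to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


# Pad a short signal to five samples, then rescale it into [0, 1].
padded = right_pad([2.0, 4.0, 6.0], 5)
normalised = min_max_normalise(padded)
print(padded)
print(normalised)
```

Padding gives every clip in a batch the same length, and normalization keeps all values in a fixed range. Storing the original minimum and maximum alongside the normalized data makes it possible to undo the scaling later.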

Syllabus

Intro
The Free Spoken Digit Dataset
Pipeline intuition + design
Implementing Loader
Implementing Padder
Implementing LogSpectrogramExtractor
Implementing MinMaxNormaliser
Implementing Preprocessing Pipeline
Implementing Saver
Recap of implemented classes
Pruning the preprocessing pipeline
Outro
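
The syllabus builds the pipeline from separate components that are applied one after another. The sketch below shows one way such chaining could look. It is an assumption for illustration only: the `Step` and `PreprocessingPipeline` classes, their `process` methods, and the two toy steps are hypothetical stand-ins, not the classes implemented in the video.

```python
class Step:
    """Hypothetical wrapper that gives any function a common process() interface."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def process(self, data):
        return self.func(data)


class PreprocessingPipeline:
    """Hypothetical pipeline that feeds each step's output into the next step."""

    def __init__(self, steps):
        self.steps = list(steps)

    def process(self, data):
        for step in self.steps:
            data = step.process(data)
        return data

    def process_batch(self, items):
        # Apply the same sequence of steps to every item in the batch.
        return [self.process(item) for item in items]


# Toy stand-in steps: keep the first four samples, then divide every sample by 10.
pipeline = PreprocessingPipeline([
    Step("trim", lambda signal: signal[:4]),
    Step("scale", lambda signal: [x / 10 for x in signal]),
])

single = pipeline.process([10, 20, 30, 40, 50])
batch = pipeline.process_batch([[10, 20], [30, 40, 50, 60, 70]])
print(single)
print(batch)
```

Because every step exposes the same method, steps can be reordered, added, or removed without changing the pipeline class itself.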

Taught by

Valerio Velardo - The Sound of AI
