Overview
Dive into the theory behind Convolutional Neural Networks (CNNs) in this 35-minute video tutorial. Explore key concepts such as convolutions, pooling, and CNN architectures, with a focus on their application to audio data. Learn about zero padding and the architectural decisions for convolution, including grid size, depth, and number of kernels, as well as pooling settings. Understand how max pooling works, using a 2x2 grid with stride 2 as the example. Discover how convolution and pooling apply specifically to audio processing, including the preparation of MFCCs for CNN input. Access accompanying slides for visual reference and join a community of AI enthusiasts for further discussion and networking opportunities.
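As a rough illustration of the concepts listed above, the following NumPy sketch applies zero padding, a single 3x3 convolution, and 2x2 max pooling with stride 2 to an MFCC-shaped array, then adds the batch and channel axes a CNN typically expects. It is not taken from the video: the function names, the 130 x 13 MFCC shape, the random data, and the averaging kernel are all assumptions chosen for illustration.

```python
import numpy as np


def zero_pad(x, pad):
    """Surround a 2D array with `pad` rows/columns of zeros on every side."""
    return np.pad(x, pad, mode="constant", constant_values=0)


def conv2d_valid(x, kernel):
    """Slide `kernel` over `x` with stride 1 and no padding.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool_2x2(x):
    """Max pooling with a 2x2 grid and stride 2 (odd trailing row/column dropped)."""
    h2, w2 = x.shape[0] // 2, x.shape[1] // 2
    trimmed = x[:h2 * 2, :w2 * 2]
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))


# Hypothetical MFCC matrix: 130 time frames x 13 coefficients (random stand-in data).
mfcc = np.random.default_rng(0).normal(size=(130, 13))

padded = zero_pad(mfcc, 1)                   # shape (132, 15)
kernel = np.ones((3, 3)) / 9.0               # assumed 3x3 averaging kernel
feature_map = conv2d_valid(padded, kernel)   # shape (130, 13): padding preserves size
pooled = max_pool_2x2(feature_map)           # shape (65, 6)

# A CNN usually expects (batch, time, coefficients, channels); MFCCs have depth 1.
cnn_input = mfcc[np.newaxis, ..., np.newaxis]  # shape (1, 130, 13, 1)
```

With a 3x3 kernel, one ring of zero padding keeps the feature map the same size as the input, and the 2x2 pooling with stride 2 then halves each dimension (rounding down).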
Syllabus
Intro
Intuition
CNN components
Convolution: Zero padding
Architectural decisions for convolution
Grid size
Depth
# of kernels
Pooling settings
Max pooling (2x2, stride 2)
CNN architecture
How does convolution/pooling apply to audio?
Preparing MFCCs for a CNN
What's up next?
Taught by
Valerio Velardo - The Sound of AI