Overview
Explore the innovative world of semantic search on audio sources in this 42-minute conference talk from Haystack US 2024. Dive into CLAP (Contrastive Language–Audio Pre-training), an approach that embeds audio and text in a single multimodal vector space, enabling semantic search across audio data much as CLIP does for images and text. Follow along as the speaker builds a small application that generates CLAP vector embeddings from audio files, indexes them into OpenSearch, and runs semantic search queries over the audio data. Gain insights into the basics of CLAP, how it works, and its potential for improving search relevance when text queries target audio sources. Benefit from the speaker's experience as a search engineer and leader in search and personalization engineering, tackling the difficult problem of relevance in audio-based search.
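The pipeline the talk describes (embed audio clips and text queries into one shared space, then rank clips by vector similarity) can be sketched as follows. This is a minimal illustration, not the speaker's code: the embeddings below are placeholders standing in for real CLAP model outputs, and the OpenSearch mapping (field names, dimension) is a hypothetical example of a `knn_vector` index definition.

```python
import math

# Hypothetical OpenSearch index mapping for audio embeddings.
# Field name and dimension are illustrative; a real CLAP model
# produces much higher-dimensional vectors (e.g. 512).
AUDIO_INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "clip_embedding": {"type": "knn_vector", "dimension": 4},
            "title": {"type": "text"},
        }
    }
}

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def knn_search(query_vec, index, k=2):
    """Rank indexed clips by similarity to the query embedding,
    mimicking what an OpenSearch k-NN query returns."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in index.items()]
    return [name for _, name in sorted(scored, reverse=True)[:k]]

# Placeholder embeddings standing in for real CLAP outputs per audio file.
audio_index = {
    "dog_barking.wav": [0.9, 0.1, 0.0, 0.1],
    "rainstorm.wav":   [0.0, 0.8, 0.6, 0.0],
    "applause.wav":    [0.1, 0.0, 0.2, 0.9],
}

# A text query like "a dog barking", once embedded into the same space,
# would be searched against the index like this:
results = knn_search([0.85, 0.15, 0.05, 0.1], audio_index, k=1)
```

Because CLAP trains its audio and text encoders contrastively into the same space, a text query's embedding lands near the embeddings of acoustically matching clips, so plain nearest-neighbor ranking is all the search step needs.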
Syllabus
Haystack US 2024 - AJ Wallace: CLAP With Me: Step by Step Semantic Search on Audio Sources
Taught by
OpenSource Connections