

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

This 40-minute conference talk from the Center for Language & Speech Processing (CLSP) at JHU presents an approach to multi-channel multi-talker automatic speech recognition (ASR). The method convolves overlapping speech signals with room impulse responses (RIRs) to build a spatial feature called RIR-SF. According to the talk, RIR-SF outperforms the state-of-the-art 3D spatial feature, achieving a 21.3% relative reduction in Character Error Rate (CER) for multi-channel multi-talker ASR systems. The talk also covers the robustness of RIR-SF in highly reverberant environments, where existing approaches are limited, and presents the theoretical analysis and experimental results behind these claims.
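
For readers new to the topic, the basic operation named in the overview, convolving a speech signal with a room impulse response, can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the RIR-SF computation from the talk: the function name `convolve_with_rirs`, the three-sample "mixture", and the two-channel RIRs are all hypothetical, and the overview does not describe how the talk turns the convolved signals into a feature.

```python
import numpy as np


def convolve_with_rirs(mixture, rirs):
    """Convolve a single-channel signal with one RIR per microphone channel.

    mixture: 1-D array of samples (hypothetical overlapped speech).
    rirs:    2-D array of shape (channels, taps), one impulse response per channel.
    Returns an array of shape (channels, len(mixture) + taps - 1).
    """
    mixture = np.asarray(mixture, dtype=float)
    rirs = np.atleast_2d(np.asarray(rirs, dtype=float))
    # Full linear convolution, channel by channel.
    return np.stack([np.convolve(mixture, rir, mode="full") for rir in rirs])


# Toy input: three samples standing in for a speech mixture (assumption).
mixture = np.array([1.0, 2.0, 3.0])

# Toy RIRs (assumption): each has a direct path plus one attenuated echo.
rirs = np.array([
    [1.0, 0.0, 0.5],   # channel 0: echo delayed by 2 samples, gain 0.5
    [1.0, 0.25, 0.0],  # channel 1: echo delayed by 1 sample, gain 0.25
])

reverberant = convolve_with_rirs(mixture, rirs)
print(reverberant.shape)  # (2, 5)
print(reverberant[0])     # [1.  2.  3.5 1.  1.5]
```

Real RIRs are thousands of taps long, so an FFT-based convolution is usually preferred in practice; the direct form is used here only to keep the arithmetic easy to follow.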

Syllabus

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Taught by

Center for Language & Speech Processing (CLSP), JHU
