RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Overview

This 40-minute conference talk from the Center for Language & Speech Processing (CLSP) at JHU presents an approach to multi-channel multi-talker automatic speech recognition (ASR). The technique convolves overlapping speech signals with room impulse responses (RIRs) to create a spatial feature called RIR-SF. The method is reported to outperform the state-of-the-art 3D spatial feature, achieving a 21.3% relative reduction in Character Error Rate (CER) for multi-channel multi-talker ASR systems. The talk also covers the robustness of RIR-SF in highly reverberant environments, the limitations of existing approaches that it aims to overcome, and the theoretical analysis and experimental results that support these claims.

Syllabus

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Taught by

Center for Language & Speech Processing (CLSP), JHU

