Towards Improved Room Impulse Response Estimation for Speech Recognition

11/08/2022
by   Anton Ratnarajah, et al.

We propose to characterize and improve the performance of blind room impulse response (RIR) estimation systems in the context of a downstream application scenario, far-field automatic speech recognition (ASR). We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators. We then propose a GAN-based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features. The architecture uses a novel energy decay relief loss to capture energy-based properties of the input reverberant speech. We show that our model outperforms the state-of-the-art baselines on acoustic benchmarks (by 72% on an energy metric), as well as in an ASR evaluation task (by 6.9% in word error rate).
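The abstract does not define the energy decay relief (EDR) loss, so the following is only a minimal sketch of one common way to compute an EDR and compare two RIRs with it. It assumes the EDR is obtained by Schroeder-style backward integration of the short-time power spectrum and that the loss is a mean squared error between the two EDRs in dB. The function names (`energy_decay_relief`, `edr_loss`, `synthetic_rir`) and the parameters (`n_fft`, `hop`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def energy_decay_relief(rir, n_fft=256, hop=64):
    """Return an EDR in dB with shape (n_frames, n_fft // 2 + 1).

    Assumption: EDR(t, f) is the energy remaining in frequency bin f
    from frame t to the end of the response (backward integration).
    """
    rir = np.asarray(rir, dtype=np.float64)
    if rir.size < n_fft:
        rir = np.pad(rir, (0, n_fft - rir.size))
    window = np.hanning(n_fft)
    n_frames = 1 + (rir.size - n_fft) // hop
    # Slice the response into overlapping, windowed frames.
    frames = np.stack(
        [rir[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Reverse, accumulate, reverse again: sum of power from frame t onward.
    remaining = np.cumsum(power[::-1], axis=0)[::-1]
    return 10.0 * np.log10(remaining + 1e-12)


def edr_loss(rir_est, rir_ref, n_fft=256, hop=64):
    """Mean squared error between two EDRs (an assumed loss form)."""
    a = energy_decay_relief(rir_est, n_fft, hop)
    b = energy_decay_relief(rir_ref, n_fft, hop)
    n = min(a.shape[0], b.shape[0])  # compare the overlapping frames only
    return float(np.mean((a[:n] - b[:n]) ** 2))


def synthetic_rir(rt60, fs=16000, duration=0.5, seed=0):
    """Toy RIR: white noise whose amplitude decays 60 dB over rt60 seconds."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    return rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60)


if __name__ == "__main__":
    short_room = synthetic_rir(rt60=0.3)
    long_room = synthetic_rir(rt60=0.6)
    print("loss, identical RIRs:", edr_loss(short_room, short_room))
    print("loss, different RT60:", edr_loss(short_room, long_room))
```

In a training setup, a term of this kind would typically be written in a differentiable framework and added to the adversarial objective. The NumPy version here is meant only to show the quantity being compared.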


