Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR

11/28/2018
by Li Chai, et al.

A challenging problem in robust automatic speech recognition (ASR) is how to measure the quality of a speech enhancement algorithm (SEA) without calculating the word error rate (WER), because computing WER requires costly manual transcription, language modeling, and decoding. Traditional measures of speech quality and intelligibility, such as PESQ and STOI, have been shown to correlate relatively poorly with WER. In this study, a novel acoustics-guided evaluation (AGE) measure is proposed for estimating the performance of SEAs for robust ASR. AGE consists of three consecutive steps: extracting low-level representations via feature extraction, obtaining high-level representations via a nonlinear mapping with the acoustic model (AM), and calculating the final AGE value between the representations of clean speech and degraded speech. Specifically, state posterior probabilities from a neural-network-based AM are adopted as the high-level representations, and the cross-entropy criterion is used to calculate AGE. Experiments demonstrate that AGE consistently yields the highest correlations with WER and gives the most accurate estimation of ASR performance compared with PESQ, STOI, and an entropy-based acoustic confidence measure. Potentially, AGE could be adopted to guide the parameter optimization of deep-learning-based SEAs to further improve recognition performance.
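The final step described in the abstract, a cross-entropy between the state posteriors of clean and degraded speech, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so several details here are assumptions: the clean posteriors are treated as the reference distribution, the score is averaged over frames, and the function name `age_score` and the toy posteriors are hypothetical. Feature extraction and the acoustic model are not reproduced; the inputs are assumed to be frame-aligned posterior matrices that such a model would output.

```python
import numpy as np

def age_score(clean_post, degraded_post, eps=1e-12):
    """Frame-averaged cross-entropy between two posterior matrices.

    Both inputs have shape (num_frames, num_states), with each row
    summing to 1. Assumption: clean posteriors act as the reference.
    """
    clean_post = np.asarray(clean_post, dtype=np.float64)
    degraded_post = np.asarray(degraded_post, dtype=np.float64)
    if clean_post.shape != degraded_post.shape:
        raise ValueError("posterior matrices must be frame-aligned")
    # Clip to avoid log(0) when the degraded posterior is exactly zero.
    log_q = np.log(np.clip(degraded_post, eps, 1.0))
    per_frame = -(clean_post * log_q).sum(axis=1)
    return float(per_frame.mean())

# Toy posteriors over 2 states for 2 frames (illustrative values only).
clean = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
uniform = np.array([[0.5, 0.5],
                    [0.5, 0.5]])

score_same = age_score(clean, clean)       # identical posteriors
score_uniform = age_score(clean, uniform)  # fully uncertain posteriors
```

Under these assumptions a lower score means the degraded (or enhanced) speech produces posteriors closer to those of clean speech, which is the property a WER proxy of this kind relies on.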

research
04/08/2022

Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners

An accurate objective speech intelligibility prediction algorithms is of...
research
03/13/2019

Frequency Domain Multi-channel Acoustic Modeling for Distant Speech Recognition

Conventional far-field automatic speech recognition (ASR) systems typica...
research
06/18/2019

Deep Xi as a Front-End for Robust Automatic Speech Recognition

Front-end techniques for robust automatic speech recognition (ASR) have ...
research
10/29/2019

Does Speech enhancement of publicly available data help build robust Speech Recognition Systems?

Automatic speech recognition (ASR) systems play a key role in many comme...
research
03/25/2022

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

Compensation for channel mismatch and noise interference is essential fo...
research
09/26/2019

An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription

Despite the strong modeling power of neural network acoustic models, spe...
research
03/17/2022

Prediction of speech intelligibility with DNN-based performance measures

This paper presents a speech intelligibility model based on automatic sp...
