DNN adaptation by automatic quality estimation of ASR hypotheses

02/06/2017
by Daniele Falavigna, et al.

In this paper we propose to exploit the automatic Quality Estimation (QE) of ASR hypotheses to perform the unsupervised adaptation of a deep neural network (DNN) that models acoustic probabilities. Our hypothesis is that significant improvements can be achieved by: i) automatically transcribing the evaluation data we are currently trying to recognise, and ii) selecting from it a subset of "good quality" instances based on the word error rate (WER) scores predicted by a QE component. To validate this hypothesis, we run several experiments on the evaluation data sets released for the CHiME-3 challenge. First, we operate in oracle conditions, in which manual transcriptions of the evaluation data are available and thus allow us to compute the "true" sentence WER. In this scenario, we perform the adaptation with varying amounts of data of different quality levels. Then, we move to realistic conditions, in which the manual transcriptions of the evaluation data are not available. In this case, the adaptation is performed on data selected according to the WER scores "predicted" by a QE component. Our results indicate that: i) QE predictions allow us to closely approximate the adaptation results obtained in oracle conditions, and ii) the overall ASR performance of the proposed QE-driven adaptation method is significantly better than the strong, most recent CHiME-3 baseline.
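The data selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ScoredUtterance`, `sentence_wer`, and `select_for_adaptation`, as well as the threshold value, are hypothetical and chosen only for illustration. The sketch assumes that sentence WER is the word-level edit distance divided by the reference length, and that "good quality" means a WER score at or below a chosen threshold. In oracle conditions the score would come from `sentence_wer` on the manual transcription; in realistic conditions it would come from a QE model, which is not modelled here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ScoredUtterance:
    """An ASR hypothesis paired with a WER score (true or predicted)."""
    utt_id: str
    hypothesis: str
    wer: float  # oracle WER, or the WER predicted by a QE component


def sentence_wer(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)


def select_for_adaptation(
    utterances: List[ScoredUtterance], max_wer: float
) -> List[ScoredUtterance]:
    """Keep only utterances whose WER score is at or below max_wer."""
    return [u for u in utterances if u.wer <= max_wer]


if __name__ == "__main__":
    # Hypothetical scores; the 0.10 threshold is an assumption, not a value
    # taken from the paper.
    pool = [
        ScoredUtterance("utt1", "turn the radio on", 0.00),
        ScoredUtterance("utt2", "turn a radio own", 0.50),
        ScoredUtterance("utt3", "open the window", 0.08),
    ]
    selected = select_for_adaptation(pool, max_wer=0.10)
    print([u.utt_id for u in selected])
```

Varying `max_wer` trades the amount of adaptation data against its quality, which is the trade-off the oracle experiments in the abstract explore.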


