Improving Distinction between ASR Errors and Speech Disfluencies with Feature Space Interpolation

08/04/2021
by Seongmin Park, et al.

Fine-tuning pretrained language models (LMs) is a popular approach to automatic speech recognition (ASR) error detection during post-processing. While error detection systems often take advantage of statistical language archetypes captured by LMs, at times the pretrained knowledge can hinder error detection performance. For instance, the presence of speech disfluencies might confuse the post-processing system into tagging disfluent but accurate transcriptions as ASR errors. Such confusion occurs because both error detection and disfluency detection attempt to identify tokens at statistically unlikely positions. This paper proposes a scheme to improve existing LM-based ASR error detection systems, both in terms of detection scores and resilience to such distracting auxiliary tasks. Our approach adopts the popular mixup method in text feature space and can be used with any black-box ASR output. To demonstrate the effectiveness of our method, we conduct post-processing experiments with both traditional and end-to-end ASR systems, in both English and Korean, on five different speech corpora. We find that our method both improves ASR error detection F1 scores and reduces the number of correctly transcribed disfluencies wrongly detected as ASR errors. Finally, we suggest methods to use the resulting LMs directly in semi-supervised ASR training.
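The abstract does not spell out the interpolation scheme, so the following is only a minimal sketch of generic mixup applied to feature vectors, not the authors' implementation. It assumes each example is a fixed-length feature vector (for instance, an LM hidden state for one token) with a one-hot label, and that the mixing weight is drawn from a Beta(alpha, alpha) distribution as in standard mixup. The function name `mixup`, the default `alpha=0.2`, and the two-class label layout are all hypothetical choices made for illustration.

```python
import random


def mixup(feat_a, feat_b, label_a, label_b, alpha=0.2, rng=None):
    """Linearly interpolate two feature vectors and their one-hot labels.

    Generic mixup: lam ~ Beta(alpha, alpha), then
    mixed = lam * a + (1 - lam) * b for both features and labels.
    alpha=0.2 is an illustrative default, not a value from the paper.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    mixed_feat = [lam * a + (1.0 - lam) * b for a, b in zip(feat_a, feat_b)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_feat, mixed_label, lam


# Hypothetical usage: mix a token tagged "ASR error" with one tagged "correct".
error_feat, correct_feat = [1.0, 0.0, 0.5], [0.0, 1.0, 0.5]
error_label, correct_label = [1.0, 0.0], [0.0, 1.0]
feat, label, lam = mixup(error_feat, correct_feat, error_label, correct_label,
                         rng=random.Random(0))
```

The mixed label is a soft target whose entries still sum to 1, so it can be used with a cross-entropy loss that accepts probability targets. Because the interpolation happens on features rather than on audio, a sketch like this needs only the text output of the ASR system, which is consistent with the black-box setting the abstract describes.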


