ASR Error Detection via Audio-Transcript entailment

07/22/2022
by Nimshi Venkat Meripo, et al.

Despite the improved performance of the latest Automatic Speech Recognition (ASR) systems, transcription errors are still unavoidable. These errors can have a considerable impact in critical domains such as healthcare, where ASR is used to help with clinical documentation. Detecting ASR errors is therefore a critical first step in preventing error propagation to downstream applications. To this end, we propose a novel end-to-end approach for ASR error detection using audio-transcript entailment. To the best of our knowledge, we are the first to frame this problem as an end-to-end entailment task between an audio segment and its corresponding transcript segment. Our intuition is that bidirectional entailment holds between audio and transcript when there is no recognition error, and fails to hold when there is one. The proposed model uses an acoustic encoder and a linguistic encoder to model the speech and the transcript, respectively. The encoded representations of both modalities are fused to predict the entailment. Since doctor-patient conversations are used in our experiments, particular emphasis is placed on medical terms. Our proposed model achieves classification error rates (CER) of 26.2% [...] errors and 23% [...] strong baseline by 12% [...]
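The fuse-then-classify idea and the classification error rate metric described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two representations are fused or how CER is computed, so fusion by concatenation, a single logistic output, and CER as the fraction of misclassified segments are all assumptions made here for illustration. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def fuse(audio_vec: Sequence[float], text_vec: Sequence[float]) -> List[float]:
    """Fuse the two encoder outputs by concatenation (assumed fusion scheme)."""
    return list(audio_vec) + list(text_vec)


def entailment_probability(
    fused: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Logistic score on the fused vector: probability that audio and
    transcript entail each other, i.e. that there is no recognition error."""
    if len(fused) != len(weights):
        raise ValueError("fused vector and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def classification_error_rate(
    predicted: Sequence[int], gold: Sequence[int]
) -> float:
    """Fraction of segments whose predicted label differs from the gold label
    (assumed definition of CER)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one labelled segment is required")
    wrong = sum(1 for p, g in zip(predicted, gold) if p != g)
    return wrong / len(gold)
```

In a real system the vectors passed to `fuse` would come from trained acoustic and linguistic encoders, and the weights would be learned jointly with them; the sketch only shows how the pieces connect.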


research
11/01/2022

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

Automatic speech recognition (ASR) systems typically rely on an external...
research
08/02/2021

Decoupling recognition and transcription in Mandarin ASR

Much of the recent literature on automatic speech recognition (ASR) is t...
research
11/03/2022

Streaming Audio-Visual Speech Recognition with Alignment Regularization

Recognizing a word shortly after it is spoken is an important requiremen...
research
06/10/2020

Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors

Speech-based virtual assistants, such as Amazon Alexa, Google assistant,...
research
11/11/2022

The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification

Linguistic anomalies detectable in spontaneous speech have shown promise...
research
07/25/2020

MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition

Audio Adversarial Examples (AAE) represent specially created inputs mean...
research
05/23/2022

Calibrate and Refine! A Novel and Agile Framework for ASR-error Robust Intent Detection

The past ten years have witnessed the rapid development of text-based in...
