HypR: A comprehensive study for ASR hypothesis revising with a reference corpus

09/18/2023
by   Yi-Wei Wang, et al.

With the development of deep learning, automatic speech recognition (ASR) has made significant progress. To further improve performance, revising the recognition results is a lightweight but efficient approach. Existing methods can be roughly classified into N-best reranking methods and error correction models. The former aim to select the hypothesis with the lowest error rate from a set of candidates generated by the ASR system for a given input utterance. The latter focus on detecting recognition errors in a given hypothesis and correcting them to obtain an improved result. However, we observe that these studies are hardly comparable with one another, as they are usually evaluated on different corpora, paired with different ASR models, and even trained on different datasets. Accordingly, we first concentrate on releasing an ASR hypothesis revising (HypR) dataset in this study. HypR contains several commonly used corpora (AISHELL-1, TED-LIUM 2, and LibriSpeech) and provides 50 recognition hypotheses for each speech utterance. The checkpoints of the ASR models are also published. In addition, we implement and compare several classic and representative methods, showing the recent research progress in revising speech recognition results. We hope the publicly available HypR dataset can become a reference benchmark for subsequent research and help advance this line of research.
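To make the N-best reranking objective concrete, the following is a minimal sketch of oracle selection: given a reference transcript and a list of candidate hypotheses, it picks the candidate with the lowest word error rate. It is not taken from the HypR release; the function names (`edit_distance`, `wer`, `oracle_select`) and the toy sentences are hypothetical, and whitespace tokenization is an assumption that fits English corpora (for Mandarin data such as AISHELL-1, error rates are usually computed per character instead).

```python
from typing import List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Token-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance divided by the reference length."""
    ref = reference.split()  # assumption: whitespace-separated words
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def oracle_select(reference: str, nbest: List[str]) -> Tuple[int, float]:
    """Return (index, WER) of the candidate with the lowest WER.

    Ties are resolved in favor of the earliest candidate in the list.
    """
    scores = [wer(reference, h) for h in nbest]
    best = min(range(len(nbest)), key=scores.__getitem__)
    return best, scores[best]


# Toy example with made-up sentences (not drawn from any corpus).
reference = "the cat sat on the mat"
nbest = [
    "the cat sat on a mat",
    "the cat sat on the mat",
    "a cat sat on the mat",
]
index, score = oracle_select(reference, nbest)
print(index, score)
```

Because the reference is unavailable at inference time, this oracle choice only gives the lower bound on the error rate that a reranker can reach for a given N-best list; a practical reranker has to approximate it with a learned scoring function.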


