Investigation of Training Label Error Impact on RNN-T

12/01/2021
by I-Fan Chen, et al.

In this paper, we propose an approach to quantitatively analyze the impact of different types of training label errors on RNN-T based ASR models. The results show that deletion errors in RNN-T training data are more harmful than substitution and insertion errors. We also examine label error mitigation approaches for RNN-T and find that, although all of the methods reduce the degradation caused by label errors to some extent, none of them closes the performance gap between models trained with and without label errors. Based on these results, we suggest designing data pipelines for RNN-T that give higher priority to reducing deletion label errors. We also find that ensuring high-quality training labels remains important, despite the existence of label error mitigation approaches.
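The abstract does not say how the label errors were introduced, so the following is only a minimal sketch of the three error types it names (deletion, substitution, insertion), not the authors' procedure. The function name `corrupt_labels`, the per-token error `rate`, and the toy vocabulary are all hypothetical choices made for illustration.

```python
import random

ERROR_TYPES = ("deletion", "substitution", "insertion")

def corrupt_labels(tokens, error_type, rate, vocab, seed=0):
    """Return a copy of `tokens` with one kind of label error injected.

    Hypothetical helper: each token is independently corrupted with
    probability `rate`. Assumes `vocab` holds at least two distinct tokens.
    """
    if error_type not in ERROR_TYPES:
        raise ValueError(f"unknown error type: {error_type!r}")
    rng = random.Random(seed)  # fixed seed keeps the corruption reproducible
    corrupted = []
    for tok in tokens:
        if rng.random() >= rate:
            corrupted.append(tok)  # token left untouched
        elif error_type == "deletion":
            continue  # drop the token from the transcript
        elif error_type == "substitution":
            # replace the token with a different vocabulary entry
            corrupted.append(rng.choice([v for v in vocab if v != tok]))
        else:  # insertion
            # keep the token and add a spurious one after it
            corrupted.append(tok)
            corrupted.append(rng.choice(vocab))
    return corrupted

if __name__ == "__main__":
    transcript = "turn on the kitchen lights".split()
    toy_vocab = ["turn", "on", "off", "the", "kitchen", "lights", "please"]
    for kind in ERROR_TYPES:
        print(kind, corrupt_labels(transcript, kind, rate=0.3, vocab=toy_vocab))
```

Corrupting a clean training set with one error type at a time, at a matched rate, is one way to compare their effects on a trained model in isolation; whether the paper uses this setup is not stated in the abstract.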


research
08/17/2022

CTRL: Clustering Training Losses for Label Error Detection

In supervised machine learning, use of correct labels is extremely impor...
research
09/08/2019

Order-free Learning Alleviating Exposure Bias in Multi-label Classification

Multi-label classification (MLC) assigns multiple labels to each sample....
research
04/02/2019

Impact of ASR on Alzheimer's Disease Detection: All Errors are Equal, but Deletions are More Equal than Others

Automatic Speech Recognition (ASR) is a critical component of any fully-...
research
10/30/2022

Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training

Training state-of-the-art ASR systems such as RNN-T often has a high ass...
research
10/02/2020

Deep Learning for Earth Image Segmentation based on Imperfect Polyline Labels with Annotation Errors

In recent years, deep learning techniques (e.g., U-Net, DeepLab) have ac...
research
06/12/2018

Imperfect Segmentation Labels: How Much Do They Matter?

Labeled datasets for semantic segmentation are imperfect, especially in ...
research
10/30/2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks

The standard approach to assess reliability of automatic speech transcri...
