Hard Sample Mining for the Improved Retraining of Automatic Speech Recognition

04/17/2019
by Jiabin Xue, et al.

Retraining with ever more new training data from the target domain is an effective way to improve the performance of existing Automatic Speech Recognition (ASR) systems. Recently, Deep Neural Networks (DNNs) have become a successful model family in the ASR field. When training DNN-based methods, the error between the predicted transcription and the corresponding annotated text is backpropagated to update and optimize the parameters. Thus, the parameters are influenced more by training samples with a large propagated error than by samples with a small one. In this paper, we define samples with significant error as hard samples and try to improve the performance of the ASR system by adding many of them. Unfortunately, hard samples are sparse in the training data of the target domain, and manually labeling them is expensive. Therefore, we propose a hard sample mining method based on enhanced deep multiple instance learning, which can find hard samples in unlabeled training data using a small, manually labeled subset of the target-domain dataset. We applied our method to an End2End ASR task and obtained the best performance.
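
To make the notion of a hard sample concrete, the following is a minimal sketch of how samples with a large transcription error could be picked out of a labeled subset. It is not the multiple-instance-learning miner described in the abstract. It assumes word error rate (WER) as the error measure and a fixed threshold as the selection rule. The names `word_error_rate`, `select_hard_samples`, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def select_hard_samples(
    samples: List[Tuple[str, str, str]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, float]]:
    """Return (sample_id, wer) pairs with wer >= threshold, hardest first.

    Each input sample is (sample_id, annotated_text, asr_transcription).
    """
    scored = [(sid, word_error_rate(ref, hyp)) for sid, ref, hyp in samples]
    hard = [(sid, wer) for sid, wer in scored if wer >= threshold]
    return sorted(hard, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    labeled = [
        ("a", "the cat sat", "the cat sat"),
        ("b", "the cat sat", "a dog sat"),
        ("c", "turn on the light", "turn of light"),
    ]
    for sample_id, wer in select_hard_samples(labeled):
        print(sample_id, round(wer, 3))
```

A threshold rule like this only works where annotated text already exists. The abstract's point is that such labels are expensive, which is why the proposed method mines hard samples from unlabeled data instead.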


