Improving Noisy Student Training on Non-target Domain Data for Automatic Speech Recognition

11/09/2022
by   Yu Chen, et al.

Noisy Student Training (NST) has recently demonstrated extremely strong performance in Automatic Speech Recognition (ASR). In this paper, we propose a data selection strategy named LM Filter to improve the performance of NST on non-target domain data in ASR tasks. Hypotheses with and without a Language Model are generated, and the CER difference between them is used as a filter threshold. Results reveal significant improvements of 10.4% over no-data-filtering baselines. We can achieve a 3.31% CER, which is the best result to our knowledge without any other supervised data. We also perform evaluations on the supervised 1000-hour AISHELL-2 dataset, and competitive results of 4.72% CER are achieved.
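The selection rule described above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration, not the authors' implementation: it assumes the two decodes (with and without an LM) are already available as strings, computes the character error rate between them via edit distance, and keeps only utterances where the two hypotheses agree within a threshold.

```python
def edit_distance(ref, hyp):
    # Standard Levenshtein distance over character sequences,
    # computed with a single rolling DP row.
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # deletion
                        dp[j - 1] + 1,        # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[n]


def cer(ref, hyp):
    # Character error rate of hyp against ref.
    return edit_distance(ref, hyp) / max(len(ref), 1)


def lm_filter(utterances, threshold=0.1):
    """Keep pseudo-labels whose LM and no-LM decodes agree closely.

    `utterances` is a list of (hyp_no_lm, hyp_with_lm) string pairs;
    the CER between the two decodes serves as a proxy for
    pseudo-label quality.  `threshold` is a hypothetical value,
    not the one used in the paper.
    """
    kept = []
    for hyp_no_lm, hyp_with_lm in utterances:
        if cer(hyp_with_lm, hyp_no_lm) <= threshold:
            kept.append(hyp_with_lm)
    return kept
```

In this sketch, utterances where the two decoders disagree heavily are discarded before the next NST iteration, on the assumption that large disagreement signals an unreliable pseudo-label.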

