Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer

07/29/2022
by Cong-Thanh Do, et al.

This paper proposes a new approach to performing unsupervised fine-tuning and self-training with unlabeled speech data for recurrent neural network (RNN)-Transducer (RNN-T) end-to-end (E2E) automatic speech recognition (ASR) systems. Conventional systems perform fine-tuning/self-training on unlabeled audio using ASR hypotheses as targets, and are therefore susceptible to the ASR performance of the base model. To alleviate the influence of ASR errors when using unlabeled data, we propose a multiple-hypothesis RNN-T loss that incorporates multiple ASR 1-best hypotheses into the loss function. For the fine-tuning task, ASR experiments on LibriSpeech show that the multiple-hypothesis approach achieves a relative reduction of 14.2% in word error rate (WER) compared with the single-hypothesis approach on the test_other set. For the self-training task, ASR models are trained using supervised data from Wall Street Journal (WSJ) and Aurora-4, along with CHiME-4 real noisy data as unlabeled data. The multiple-hypothesis approach yields a relative reduction of 3.3% in WER on the CHiME-4 single-channel real noisy evaluation set compared with the single-hypothesis approach.
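The abstract does not spell out how the multiple 1-best hypotheses enter the RNN-T loss. As a rough illustration only, the sketch below assumes the combined objective is a normalized weighted sum of standard RNN-T losses, one per hypothesis, computed with torchaudio.functional.rnnt_loss; the function name multi_hypothesis_rnnt_loss, the per-hypothesis weights, and the tensor layout are assumptions made for this sketch, not the paper's implementation.

```python
# Minimal sketch of a multiple-hypothesis RNN-T loss (illustrative only).
# Assumption: the total loss is a weighted average of per-hypothesis
# RNN-T losses; weights could come from e.g. hypothesis confidence scores.
import torch
import torchaudio.functional as TAF


def multi_hypothesis_rnnt_loss(logits_per_hyp, hyps, hyp_lengths,
                               logit_lengths, hyp_weights, blank_id=0):
    """Combine RNN-T losses computed against K different 1-best hypotheses.

    logits_per_hyp: list of K tensors, each (B, T, U_k + 1, V); the joint
        network must be re-run per hypothesis, since the prediction
        network consumes that hypothesis's tokens.
    hyps:           list of K int32 tensors (B, U_k) of hypothesis tokens.
    hyp_lengths:    list of K int32 tensors (B,) of hypothesis lengths.
    logit_lengths:  int32 tensor (B,) of encoder output lengths.
    hyp_weights:    K non-negative weights, normalized below.
    """
    per_hyp_losses = []
    for logits, targets, target_lengths in zip(logits_per_hyp, hyps,
                                               hyp_lengths):
        per_hyp_losses.append(
            TAF.rnnt_loss(logits, targets, logit_lengths, target_lengths,
                          blank=blank_id, reduction="mean"))
    losses = torch.stack(per_hyp_losses)          # (K,) per-hypothesis losses
    w = torch.as_tensor(hyp_weights, dtype=losses.dtype, device=losses.device)
    w = w / w.sum()                               # normalize the weights
    return (w * losses).sum()
```

In an unsupervised fine-tuning or self-training loop, the K hypotheses would be decoded from the unlabeled audio by the base model (for example, N-best beam-search outputs), and a combined loss of this kind would stand in for the single-hypothesis RNN-T loss on each pseudo-labeled batch.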


