Exploiting semi-supervised training through a dropout regularization in end-to-end speech recognition

08/08/2019
by   Subhadeep Dey, et al.

In this paper, we explore various approaches for semi-supervised learning in an end-to-end automatic speech recognition (ASR) framework. The first step in our approach involves training a seed model on the limited amount of labelled data. Additional unlabelled speech data is then employed through a data selection mechanism to obtain the best hypothesized output, which is further used to retrain the seed model. However, the uncertainties of the model may not be well captured with a single hypothesis. As opposed to this technique, we apply a dropout mechanism to capture the uncertainty by obtaining multiple hypothesized text transcripts of a speech recording. We assume that the diversity of automatically generated transcripts for an utterance will implicitly increase the reliability of the model. Finally, the data selection process is also applied to these hypothesized transcripts to reduce the uncertainty. Experiments on the freely available TEDLIUM corpus and Adobe's proprietary internal dataset show that the proposed approach significantly reduces ASR errors compared to the baseline model.
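
The abstract does not specify how hypotheses are scored or selected, so the following is only a minimal sketch of the general idea: decode each unlabelled utterance several times with dropout left active, then keep the utterances whose hypotheses agree closely. The function names, the agreement measure (mean pairwise word-level edit distance), the threshold, and the `decode_with_dropout` callable are all assumptions made for illustration, not the authors' method.

```python
import itertools
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def disagreement(hyps: List[List[str]]) -> float:
    """Mean pairwise edit distance, normalised by the longer hypothesis.

    This is an assumed stand-in for model uncertainty; 0.0 means all
    sampled hypotheses are identical.
    """
    pairs = list(itertools.combinations(hyps, 2))
    if not pairs:
        return 0.0
    total = sum(edit_distance(a, b) / max(len(a), len(b), 1) for a, b in pairs)
    return total / len(pairs)


def select_pseudo_labels(
    utterances: Sequence[str],
    decode_with_dropout: Callable[[str], List[str]],  # hypothetical stochastic decoder
    num_samples: int = 4,
    max_disagreement: float = 0.2,  # assumed threshold, not from the paper
) -> List[Tuple[str, List[str]]]:
    """Sample several hypotheses per utterance and keep the consistent ones.

    All sampled hypotheses of a selected utterance are kept, so the
    retraining set contains multiple transcripts per recording.
    """
    selected = []
    for utt in utterances:
        hyps = [decode_with_dropout(utt) for _ in range(num_samples)]
        if disagreement(hyps) <= max_disagreement:
            selected.extend((utt, hyp) for hyp in hyps)
    return selected
```

In a real system, `decode_with_dropout` would wrap the seed ASR model with its dropout layers kept stochastic at inference time, and the selected pairs would be added to the labelled data for retraining.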

research
03/29/2021

Multiple-hypothesis CTC-based semi-supervised adaptation of end-to-end speech recognition

This paper proposes an adaptation method for end-to-end speech recogniti...
research
10/31/2018

End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator

The speech chain mechanism integrates automatic speech recognition (ASR)...
research
10/29/2020

Semi-Supervised Speech Recognition via Graph-based Temporal Classification

Semi-supervised learning has demonstrated promising results in automatic...
research
10/11/2021

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy

Pseudo-labeling (PL), a semi-supervised learning (SSL) method where a se...
research
07/07/2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

We propose a semi-supervised learning method for building end-to-end ric...
research
10/20/2022

Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses

We propose a novel method that combines CycleGAN and inter-domain losses...
research
02/12/2021

Multimodal Punctuation Prediction with Contextual Dropout

Automatic speech recognition (ASR) is widely used in consumer electronic...
