Biased Self-supervised learning for ASR

11/04/2022
by   Florian L. Kreyssig, et al.

Self-supervised learning via masked prediction pre-training (MPPT) has shown impressive performance on a range of speech-processing tasks. This paper proposes a method to bias self-supervised learning towards a specific task. The core idea is to slightly fine-tune the model that is used to obtain the target sequence. This leads to better performance and a substantial increase in training speed. Furthermore, this paper proposes a variant of MPPT that allows low-footprint streaming models to be trained effectively by computing the MPPT loss on both masked and unmasked frames. These approaches are evaluated for automatic speech recognition on the Librispeech corpus, with 100 hours of data serving as the labelled data and 860 hours as the unlabelled data. The biased training outperforms the unbiased training by 15.5% [...] 23.8% [...] pre-training approach yields a reduction in word error rate of 44.1%.
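The idea of computing the MPPT loss on masked and unmasked frames can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `unmasked_weight` parameter, and the weighted-average form of the loss are assumptions made for illustration, and the target labels are taken as given (in the biased variant they would come from a slightly fine-tuned model).

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's class logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def mppt_loss(frame_logits, targets, mask, unmasked_weight=0.0):
    """Weighted cross-entropy between predictions and target labels.

    frame_logits: list of per-frame logit lists (one value per target class)
    targets:      list of integer target labels, one per frame
    mask:         list of booleans, True where the input frame was masked
    unmasked_weight: hypothetical knob; 0.0 scores masked frames only,
                     1.0 scores masked and unmasked frames equally.
    """
    total, weight_sum = 0.0, 0.0
    for logits, target, is_masked in zip(frame_logits, targets, mask):
        weight = 1.0 if is_masked else unmasked_weight
        if weight == 0.0:
            continue  # frame does not contribute to the loss
        total += -weight * log_softmax(logits)[target]
        weight_sum += weight
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy example: 4 frames, 3 target classes, frames 0 and 2 are masked.
logits = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
targets = [0, 0, 1, 1]
mask = [True, False, True, False]

masked_only = mppt_loss(logits, targets, mask, unmasked_weight=0.0)
all_frames = mppt_loss(logits, targets, mask, unmasked_weight=1.0)
```

With `unmasked_weight=0.0` only the masked frames are scored; setting it above zero also lets the unmasked frames contribute to the training signal.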


