Combining Unsupervised and Text Augmented Semi-Supervised Learning for Low Resourced Autoregressive Speech Recognition

10/29/2021
by Chak-Fai Li, et al.

Recent advances in unsupervised representation learning have demonstrated the impact of pretraining on large amounts of read speech. We adapt these techniques for domain adaptation in low-resource – both in terms of data and compute – conversational and broadcast domains. Moving beyond CTC, we pretrain state-of-the-art Conformer models in an unsupervised manner. While the unsupervised approach outperforms traditional semi-supervised training, the two techniques are complementary: combining them yields a 5% improvement in WER, averaged over all conditions, compared to semi-supervised training alone. Additional text data is incorporated through external language models. By using CTC-based decoding, we are better able to take advantage of this additional text data. When the CTC model is used as a transcription model, it allows the Conformer model to incorporate the knowledge from the language model through semi-supervised training more effectively than shallow fusion does. Final performance is an additional 2% better when the language model is incorporated through semi-supervised training rather than through shallow fusion.
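The abstract contrasts two ways of using the external language model: shallow fusion at decode time versus having a CTC model, decoded with the language model, transcribe unlabeled audio for semi-supervised training. As a rough illustration of the first idea only, the minimal NumPy sketch below interpolates acoustic-model and language-model log-probabilities at a single decoding step; the toy vocabulary, scores, and lm_weight are hypothetical and not taken from the paper.

```python
import numpy as np

# Shallow fusion (illustrative sketch): at each decoding step the acoustic
# model's log-probabilities are log-linearly combined with an external
# language model's log-probabilities. All values below are made up.

VOCAB = ["<blank>", "hello", "world", "speech"]

def shallow_fusion_step(am_log_probs: np.ndarray,
                        lm_log_probs: np.ndarray,
                        lm_weight: float = 0.3) -> int:
    """Choose the next token from combined AM + weighted LM scores."""
    combined = am_log_probs + lm_weight * lm_log_probs
    return int(np.argmax(combined))

# One decoding step with fabricated posteriors.
am = np.log(np.array([0.10, 0.60, 0.20, 0.10]))  # acoustic model
lm = np.log(np.array([0.05, 0.30, 0.50, 0.15]))  # external LM
print(VOCAB[shallow_fusion_step(am, lm)])        # -> "hello"
```

In the semi-supervised alternative the abstract favors, the language model influences transcription of unlabeled audio, and the resulting pseudo-labels become training targets for the Conformer, so the language-model knowledge is absorbed into the model's weights rather than applied only at inference time.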


