Semi-Supervised Speech Recognition via Local Prior Matching

02/24/2020
by Wei-Ning Hsu, et al.

For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability. In this work, we propose local prior matching (LPM), a semi-supervised objective that distills knowledge from a strong prior (e.g. a language model) to provide a learning signal to a discriminative model trained on unlabeled speech. We demonstrate that LPM is theoretically well-motivated, simple to implement, and superior to existing knowledge distillation techniques under comparable settings. Starting from a baseline trained on 100 hours of labeled speech, with an additional 360 hours of unlabeled data, LPM recovers 54% and 73% of the word error rate on clean and noisy test sets, respectively, relative to a fully supervised model on the same data.
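The abstract does not spell out the objective, so the following is only a minimal sketch of one plausible reading of "matching a local prior". It assumes that, for each unlabeled utterance, a small set of candidate transcripts is available (for example from beam search), that the language model scores are renormalized over just those candidates to form the local prior, and that the loss is the cross-entropy from that prior to the acoustic model's posterior renormalized over the same candidates. The function names and the renormalization of the model scores are illustrative assumptions, not details taken from the paper.

```python
import math


def log_softmax(scores):
    """Normalize a list of log-scores so their exponentials sum to 1."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def local_prior_matching_loss(lm_log_probs, model_log_probs):
    """Cross-entropy from a local prior to a local model posterior.

    lm_log_probs:    language model log-probabilities of the candidate
                     transcripts (hypothetical input, one per candidate).
    model_log_probs: the discriminative model's log-scores for the same
                     candidates given the unlabeled utterance.

    Both are renormalized over the candidate set only (an assumption),
    which is what makes the prior "local".
    """
    log_q = log_softmax(lm_log_probs)     # local prior over candidates
    log_p = log_softmax(model_log_probs)  # local posterior over candidates
    return -sum(math.exp(lq) * lp for lq, lp in zip(log_q, log_p))


# Two candidates; the language model strongly prefers the first one.
lm_scores = [math.log(0.9), math.log(0.1)]
agrees = local_prior_matching_loss(lm_scores, [math.log(0.9), math.log(0.1)])
disagrees = local_prior_matching_loss(lm_scores, [math.log(0.1), math.log(0.9)])
```

Under these assumptions the loss is smaller when the model ranks candidates the way the prior does (`agrees` is less than `disagrees`), so minimizing it on unlabeled speech pushes the model toward transcripts the language model considers plausible.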
