KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods

08/23/2023
by   Antoine Nzeyimana, et al.
0

Despite recent availability of large transcribed Kinyarwanda speech data, achieving robust speech recognition for Kinyarwanda is still challenging. In this work, we show that using self-supervised pre-training, following a simple curriculum schedule during fine-tuning and using semi-supervised learning to leverage large unlabelled speech data significantly improve speech recognition performance for Kinyarwanda. Our approach focuses on using public domain data only. A new studio-quality speech dataset is collected from a public website, then used to train a clean baseline model. The clean baseline model is then used to rank examples from a more diverse and noisy public dataset, defining a simple curriculum training schedule. Finally, we apply semi-supervised learning to label and learn from large unlabelled data in four successive generations. Our final model achieves 3.2 15.9 best of our knowledge. Our experiments also indicate that using syllabic rather than character-based tokenization results in better speech recognition performance for Kinyarwanda.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/16/2022

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training

Recent studies have shown that the benefits provided by self-supervised ...
research
06/20/2020

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

We show for the first time that learning powerful representations from s...
research
02/24/2020

Semi-Supervised Speech Recognition via Local Prior Matching

For sequence transduction tasks like speech recognition, a strong struct...
research
10/20/2020

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

We employ a combination of recent developments in semi-supervised learni...
research
10/27/2022

Training Autoregressive Speech Recognition Models with Limited in-domain Supervision

Advances in self-supervised learning have significantly reduced the amou...
research
12/05/2022

SoftCTC x2013 Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels

This paper explores semi-supervised training for sequence tasks, such as...
research
07/17/2018

Learning Noise-Invariant Representations for Robust Speech Recognition

Despite rapid advances in speech recognition, current models remain brit...

Please sign up or login with your details

Forgot password? Click here to reset