Using Radio Archives for Low-Resource Speech Recognition: Towards an Intelligent Virtual Assistant for Illiterate Users

04/27/2021
by Moussa Doumbouya, et al.

For many of the 700 million illiterate people around the world, speech recognition technology could provide a bridge to valuable information and services. Yet, those most in need of this technology are often the most underserved by it. In many countries, illiterate people tend to speak only low-resource languages, for which the datasets necessary for speech technology development are scarce. In this paper, we investigate the effectiveness of unsupervised speech representation learning on noisy radio broadcasting archives, which are abundant even in low-resource languages. We make three core contributions. First, we release two datasets to the research community. The first, West African Radio Corpus, contains 142 hours of audio in more than 10 languages with a labeled validation subset. The second, West African Virtual Assistant Speech Recognition Corpus, consists of 10K labeled audio clips in four languages. Next, we share West African wav2vec, a speech encoder trained on the noisy radio corpus, and compare it with the baseline Facebook speech encoder trained on six times more data of higher quality. We show that West African wav2vec performs similarly to the baseline on a multilingual speech recognition task, and significantly outperforms the baseline on a West African language identification task. Finally, we share the first-ever speech recognition models for Maninka, Pular and Susu, languages spoken by a combined 10 million people in over seven countries, including six where the majority of the adult population is illiterate. Our contributions offer a path forward for ethical AI research to serve the needs of those most disadvantaged by the digital divide.
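
West African wav2vec is a wav2vec-style speech encoder, so a typical way to use it downstream is to load the pretrained checkpoint, extract frame-level context representations from raw 16 kHz audio, and fine-tune a small speech recognition or language identification head on top of them. The sketch below illustrates only that feature extraction step, using fairseq's published wav2vec interface; the checkpoint name west_african_wav2vec.pt and the dummy waveform are placeholders, not released artifact names, and the exact pipeline used in the paper may differ.

    # Minimal sketch: extracting wav2vec representations from 16 kHz audio
    # with fairseq's wav2vec interface. Paths and file names are placeholders.
    import torch
    import fairseq

    cp_path = "west_african_wav2vec.pt"  # hypothetical pretrained checkpoint
    models, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])
    model = models[0]
    model.eval()

    # One second of dummy 16 kHz audio; replace with a real mono waveform tensor.
    wav_input_16khz = torch.randn(1, 16000)

    with torch.no_grad():
        z = model.feature_extractor(wav_input_16khz)  # local frame-level features
        c = model.feature_aggregator(z)               # context representations

    # c has shape (batch, channels, frames) and can feed a small speech
    # recognition or language identification head trained on labeled clips.
    print(c.shape)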

Related research

06/26/2022
Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi
In this paper we discuss an in-progress work on the development of a spe...

07/06/2022
Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands
With the rise of deep learning and intelligent vehicles, the smart assis...

07/12/2022
Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition
The Huqariq corpus is a multilingual collection of speech from native Pe...

05/24/2021
Unsupervised Speech Recognition
Despite rapid progress in the recent past, current speech recognition sy...

03/04/2021
Transfer learning from High-Resource to Low-Resource Language Improves Speech Affect Recognition Classification Accuracy
Speech Affect Recognition is a problem of extracting emotional affects f...

05/18/2022
Macedonian Speech Synthesis for Assistive Technology Applications
Speech technology is becoming ever more ubiquitous with the advance of s...

01/29/2020
Learning Robust and Multilingual Speech Representations
Unsupervised speech representation learning has shown remarkable success...
