Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages

06/07/2023
by Claytone Sikasote, et al.

This work introduces Zambezi Voice, an open-source multilingual speech resource for Zambian languages. It contains two collections of data: unlabelled audio recordings of radio news and talk show programs (160 hours) and labelled data (over 80 hours) consisting of read speech recorded from text sourced from publicly available literature. The dataset was created for speech recognition but can also support multilingual speech processing research with both supervised and unsupervised learning approaches. To our knowledge, this is the first multilingual speech dataset created for Zambian languages. To build baseline end-to-end (E2E) speech recognition models, we exploit pretraining and cross-lingual transfer learning by fine-tuning the large-scale multilingual pre-trained Wav2Vec2.0 model. The dataset is released publicly under a Creative Commons BY-NC-ND 4.0 license and can be accessed via https://github.com/unza-speech-lab/zambezi-voice.
