AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages

03/22/2023
by Chris Chinenye Emezue, et al.

Speech technologies have advanced remarkably, yet their integration with African languages remains limited by the scarcity of African speech corpora. To address this issue, we present AfroDigits, a minimalist, community-driven dataset of spoken digits for African languages, currently covering 38 African languages. To demonstrate the practical applications of AfroDigits, we conduct audio digit classification experiments on six African languages: Igbo (ibo), Yoruba (yor), Rundi (run), Oshiwambo (kua), Shona (sna), and Oromo (gax), using the Wav2Vec2.0-Large and XLS-R models. Our experiments reveal a useful insight into the effect of mixing African speech corpora during finetuning. AfroDigits is the first published audio digit dataset for African languages, and we believe it will, among other things, pave the way for Afro-centric speech applications such as the recognition of telephone numbers and street numbers. We release the dataset and platform publicly at https://huggingface.co/datasets/chrisjay/crowd-speech-africa and https://huggingface.co/spaces/chrisjay/afro-speech, respectively.


