AI4D – African Language Dataset Challenge

07/23/2020
by Kathleen Siminyu, et al.

As language and speech technologies become more advanced, the lack of fundamental digital resources for African languages, such as data, spell checkers and part-of-speech taggers, means that the digital divide between these languages and others keeps growing. This work details the organisation of the AI4D - African Language Dataset Challenge, an effort to incentivise the creation, organisation and discovery of African language datasets through a competitive challenge. We particularly encouraged the submission of annotated datasets that can be used to train task-specific supervised machine learning models.
