
AI4D – African Language Program

by Kathleen Siminyu, et al.

Advances in speech and language technologies enable tools such as voice search, text-to-speech, speech recognition and machine translation. These tools are, however, only available for high-resource languages such as English, French or Chinese. Without foundational digital resources for African languages, which are considered low-resource in the digital context, these advanced tools remain out of reach. This work details the AI4D – African Language Program, a three-part project that 1) incentivised the crowd-sourcing, collection and curation of language datasets through an online quantitative and qualitative challenge, 2) supported research fellows for a period of 3-4 months to create datasets annotated for NLP tasks, and 3) hosted competitive Machine Learning challenges on the basis of these datasets. Key outcomes of the work so far include 1) the creation of 9+ open-source African language datasets annotated for a variety of ML tasks, and 2) the creation of baseline models for these datasets through the hosting of competitive ML challenges.


