Learnings from Technological Interventions in a Low Resource Language: A Case-Study on Gondi

04/21/2020
by   Devansh Mehta, et al.

The primary obstacle to developing technologies for low-resource languages is the lack of usable data. In this paper, we report the adoption and deployment of four technology-driven methods of data collection for Gondi, a low-resource, vulnerable language spoken by around 2.3 million tribal people in south and central India. In the process of data collection, we also support the revival of Gondi by expanding access to information in the language through linguistic resources that the community can use: a dictionary, children's stories, an app with Gondi content from multiple sources, and an Interactive Voice Response (IVR) based mass awareness platform. By the end of these interventions, we had collected just under 12,000 translated words and/or sentences and identified more than 650 community members whose help can be solicited for future translation efforts. The larger goal of the project is to collect enough data in Gondi to build and deploy viable language technologies, such as machine translation and speech-to-text systems, that can help bring the language onto the internet.

research
11/29/2022

Learnings from Technological Interventions in a Low Resource Language: Enhancing Information Access in Gondi

The primary obstacle to developing technologies for low-resource languag...
research
04/12/2022

Not always about you: Prioritizing community needs when developing endangered language technology

Languages are classified as low-resource when they lack the quantity of ...
research
11/05/2020

Measuring Data Collection Quality for Community Healthcare

Machine learning has tremendous potential to provide targeted interventi...
research
08/25/2022

Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks

Indigenous African languages are categorized as under-served in Artifici...
research
05/31/2023

Ethical Considerations for Machine Translation of Indigenous Languages: Giving a Voice to the Speakers

In recent years machine translation has become very successful for high-...
research
04/06/2021

AI4D – African Language Program

Advances in speech and language technologies enable tools such as voice-...
research
05/01/2019

A system for the 2019 Sentiment, Emotion and Cognitive State Task of DARPAs LORELEI project

During the course of a Humanitarian Assistance-Disaster Relief (HADR) cr...
