Moroccan Dialect -Darija- Open Dataset

02/28/2021
by Aissam Outchakoucht, et al.

Darija Open Dataset (DODa) is an open-source project for the Moroccan dialect. With more than 10,000 entries, DODa is arguably the largest open-source collaborative project for Darija-English translation built for Natural Language Processing purposes. Besides semantic categorization, DODa also adopts a syntactic one, presents words under different spellings, offers verb-to-noun and masculine-to-feminine correspondences, contains the conjugation of hundreds of verbs in different tenses, and includes many other subsets to help researchers better understand and study the Moroccan dialect. This data paper describes DODa, its features, and how it was collected, and presents a first application in image classification using ImageNet labels translated to Darija. This collaborative project is hosted on GitHub under the MIT open-source license and aims to be a standard resource for researchers, students, and anyone interested in the Moroccan dialect.
