Despite the progress we have recorded in the last few years in multiling...
In this paper, we create YORC: a new multi-choice Yoruba Reading
Compreh...
We introduce the ÌròyìnSpeech corpus – a new dataset influenced
by a des...
Pretrained language models (PLMs) are today the primary model for natura...
Africa has over 2000 indigenous languages but they are under-represented...
African languages are severely under-represented in NLP research due to ...
We present the first Africentric SemEval Shared task, Sentiment Analysis...
Africa is home to over 2000 languages from over six language families an...
The BLOOM model is a large open-source multilingual language model capab...
BibleTTS is a large, high-quality, open speech dataset for ten languages...
Transferring knowledge from one domain to another is of practical import...
For high-resource languages like English, text classification is a
well-...
Learning semantically meaningful sentence embeddings is an open problem ...
A movie that is thoroughly enjoyed and recommended by an individual migh...
Incorrect labels in training data occur when human annotators make mista...
Multilingual pre-trained language models (PLMs) have demonstrated impres...
What can pre-trained multilingual sequence-to-sequence models like mBART...
Sentiment analysis is one of the most widely studied applications in NLP...
Documents as short as a single sentence may inadvertently reveal sensiti...
We take a step towards addressing the under-representation of the Africa...
Machine Learning approaches to Natural Language Processing tasks benefit...
Differentially private stochastic gradient descent (DPSGD) is a variatio...
The lack of labeled training data has limited the development of natural...
West African Pidgin English is a language that is significantly spoken i...
Advanced neural language models (NLMs) are widely used in sequence gener...