HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis

We present the findings of SemEval-2023 Task 12, a shared task on sentiment analysis for low-resource African languages using a Twitter dataset. The task featured three subtasks: subtask A is monolingual sentiment classification with 12 tracks, each covering a single language; subtask B is multilingual sentiment classification using the tracks in subtask A; and subtask C is zero-shot sentiment classification. We present the results and findings for subtasks A, B, and C, and we release the code on GitHub. Our goal is to leverage low-resource tweet data using the pre-trained Afro-xlmr-large, AfriBERTa-Large, Bert-base-arabic-camelbert-da-sentiment (Arabic-camelbert), Multilingual-BERT (mBERT), and BERT models for sentiment analysis of 14 African languages. The datasets for these subtasks consist of gold-standard, multi-class labeled Twitter data from these languages. Our results demonstrate that the Afro-xlmr-large model outperformed the other models on most of the language datasets. Similarly, the Nigerian languages Hausa, Igbo, and Yoruba achieved better performance than the other languages, which can be attributed to the larger volume of data available for these languages.
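As a rough illustration of how the three subtask settings differ, the sketch below arranges per-language data into monolingual, multilingual, and zero-shot splits. It is not the authors' code: the function `build_splits`, the track name `target_lang`, the label strings, and the toy tweets are all hypothetical, and it assumes that zero-shot languages are simply those with test data but no training data.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (tweet text, sentiment label)
Split = Dict[str, List[Example]]  # keys: "train" and "test"


def build_splits(
    train: Dict[str, List[Example]],
    test: Dict[str, List[Example]],
) -> Dict[str, Dict[str, Split]]:
    """Arrange per-language data into the three subtask settings (hypothetical helper)."""
    seen = sorted(train)  # languages that have labeled training data
    unseen = sorted(set(test) - set(train))  # assumed zero-shot targets

    pooled_train = [ex for lang in seen for ex in train[lang]]
    pooled_test = [ex for lang in seen for ex in test.get(lang, [])]

    return {
        # Subtask A: one track per language, trained and tested on that language.
        "A": {lang: {"train": train[lang], "test": test.get(lang, [])} for lang in seen},
        # Subtask B: a single multilingual track pooling the subtask A languages.
        "B": {"multilingual": {"train": pooled_train, "test": pooled_test}},
        # Subtask C: test on languages with no training data of their own.
        "C": {lang: {"train": pooled_train, "test": test[lang]} for lang in unseen},
    }


# Toy, made-up examples; real tweets and labels come from the task datasets.
train = {
    "hausa": [("tweet h1", "positive"), ("tweet h2", "negative")],
    "igbo": [("tweet i1", "neutral")],
}
test = {
    "hausa": [("tweet h3", "neutral")],
    "igbo": [("tweet i2", "positive")],
    "target_lang": [("tweet t1", "negative")],
}
splits = build_splits(train, test)
print(len(splits["A"]), len(splits["B"]["multilingual"]["train"]), list(splits["C"]))
```

In this arrangement a zero-shot track reuses the pooled training data from the other languages, which is one plausible reading of the setting rather than a description of the submitted system.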


