NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

Sentiment analysis is one of the most widely studied applications in NLP, but most work focuses on languages with large amounts of data. We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yorùbá), consisting of around 30,000 annotated tweets per language (14,000 for Nigerian-Pidgin), including a significant fraction of code-mixed tweets. We propose text collection, filtering, processing, and labeling methods that enable us to create datasets for these low-resource languages. We evaluate a range of pre-trained models and transfer strategies on the dataset. We find that language-specific models and language-adaptive fine-tuning generally perform best. We release the datasets, trained models, sentiment lexicons, and code to incentivize research on sentiment analysis in under-represented languages.

research
04/26/2023

HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis

We present the findings of SemEval-2023 Task 12, a shared task on sentim...
research
05/09/2022

A Dataset and BERT-based Models for Targeted Sentiment Analysis on Turkish Texts

Targeted Sentiment Analysis aims to extract sentiment towards a particul...
research
11/03/2020

XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection

We introduce XED, a multilingual fine-grained emotion dataset. The datas...
research
05/06/2021

On the logistical difficulties and findings of Jopara Sentiment Analysis

This paper addresses the problem of sentiment analysis for Jopara, a cod...
research
04/21/2019

UniSent: Universal Adaptable Sentiment Lexica for 1000+ Languages

In this paper, we introduce UniSent a universal sentiment lexica for 100...
research
06/20/2019

A New Statistical Approach for Comparing Algorithms for Lexicon Based Sentiment Analysis

Lexicon based sentiment analysis usually relies on the identification of...
research
08/10/2022

The Moral Foundations Reddit Corpus

Moral framing and sentiment can affect a variety of online and offline b...
