Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark

06/13/2023
by Łukasz Augustyniak, et al.

Despite impressive advances in multilingual corpus collection and model training, deploying multilingual models at scale remains a significant challenge. This is particularly true for language tasks that are culture-dependent. One such task is multilingual sentiment analysis, where affective markers can be subtle and deeply embedded in culture. This work presents the most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 datasets, manually selected according to strict quality criteria from over 350 datasets reported in the scientific literature. It covers 27 languages representing 6 language families. Datasets can be queried using several linguistic and functional features. In addition, we present a multi-faceted sentiment classification benchmark summarizing hundreds of experiments conducted on different base models, training objectives, dataset collections, and fine-tuning strategies.


research
04/27/2023

UIO at SemEval-2023 Task 12: Multilingual fine-tuning for sentiment classification in low-resource languages

Our contribution to the 2023 AfriSenti-SemEval shared task 12: Sentiment...
research
04/11/2022

Assessment of Massively Multilingual Sentiment Classifiers

Models are increasing in size and complexity in the hunt for SOTA. But w...
research
05/30/2016

Going Deeper for Multilingual Visual Sentiment Detection

This technical report details several improvements to the visual concept...
research
04/26/2023

HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis

We present the findings of SemEval-2023 Task 12, a shared task on sentim...
research
06/01/2023

UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis

This paper describes our system designed for SemEval-2023 Task 12: Senti...
research
08/16/2015

Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology

Every culture and language is unique. Our work expressly focuses on the ...
research
06/17/2016

Universal, Unsupervised (Rule-Based), Uncovered Sentiment Analysis

We present a novel unsupervised approach for multilingual sentiment anal...
