UCAS-IIE-NLP at SemEval-2023 Task 12: Enhancing Generalization of Multilingual BERT for Low-resource Sentiment Analysis

06/01/2023
by   Dou Hu, et al.
0

This paper describes our system designed for SemEval-2023 Task 12: Sentiment analysis for African languages. The challenge faced by this task is the scarcity of labeled data and linguistic resources in low-resource settings. To alleviate these, we propose a generalized multilingual system SACL-XLMR for sentiment analysis on low-resource languages. Specifically, we design a lexicon-based multilingual BERT to facilitate language adaptation and sentiment-aware representation learning. Besides, we apply a supervised adversarial contrastive learning technique to learn sentiment-spread structured representations and enhance model generalization. Our system achieved competitive results, largely outperforming baselines on both multilingual and zero-shot sentiment classification subtasks. Notably, the system obtained the 1st rank on the zero-shot classification subtask in the official ranking. Extensive experiments demonstrate the effectiveness of our system.

READ FULL TEXT
research
04/26/2023

HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis

We present the findings of SemEval-2023 Task 12, a shared task on sentim...
research
12/06/2021

Zero-shot hashtag segmentation for multilingual sentiment analysis

Hashtag segmentation, also known as hashtag decomposition, is a common s...
research
06/07/2018

Semi-supervised and Transfer learning approaches for low resource sentiment classification

Sentiment classification involves quantifying the affective reaction of ...
research
05/04/2023

DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning

In recent years, sentiment analysis has gained significant importance in...
research
02/27/2022

Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation

In this paper, we propose a variational autoencoder with disentanglement...
research
04/22/2022

A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning

Subword tokenization is a commonly used input pre-processing step in mos...
research
06/13/2023

Massively Multilingual Corpus of Sentiment Datasets and Multi-faceted Sentiment Classification Benchmark

Despite impressive advancements in multilingual corpora collection and m...

Please sign up or login with your details

Forgot password? Click here to reset