UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social Media Texts using Transformers and Multi-Task Learning

09/06/2020
by George-Eduard Zaharia, et al.

Sentiment analysis is widely used in today's opinion mining campaigns. It has applications in a variety of fields, especially in collecting information about the attitude or satisfaction of users concerning a particular subject. However, the task becomes noticeably more difficult in cultures that tend to combine two languages to express ideas and thoughts. By interleaving words from two languages, users can express themselves with ease, but at the cost of making the text far less intelligible both for readers who are not familiar with this technique and for standard opinion mining algorithms. In this paper, we describe the systems developed by our team for SemEval-2020 Task 9, which covers two well-known code-mixed language pairs: Hindi-English and Spanish-English. We address this problem with a solution that takes advantage of several neural network approaches, as well as pre-trained word embeddings. Our approach (multilingual BERT) achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850 registered on the competition leaderboard, ranking our team 16th out of 62 participants. For the Spanish-English task, we obtained an average F1-score of 0.7064, ranking our team 17th out of 29 participants, by using another multilingual Transformer-based model, XLM-RoBERTa.
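To make the reported metric concrete, the following is a minimal sketch of how an average F1-score over sentiment classes can be computed. The abstract does not say which averaging scheme the leaderboard uses, so the unweighted (macro) average shown here is an assumption, as are the label names and the function names `per_class_f1` and `macro_f1`, which are purely illustrative.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return a dict mapping each label to its F1-score."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, for illustration only.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "neutral"]
print(per_class_f1(gold, pred))
print(macro_f1(gold, pred))
```

A weighted average, where each class's F1 is scaled by its share of the gold labels, would differ from this value whenever the classes are imbalanced.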


