CS-Embed-francesita at SemEval-2020 Task 9: The effectiveness of code-switched word embeddings for sentiment analysis

The growing popularity and range of applications of sentiment analysis of social media posts have naturally led to sentiment analysis of posts written in multiple languages, a practice known as code-switching. While recent research into code-switched posts has focused on the use of multilingual word embeddings, these embeddings were not trained on code-switched data. In this work, we present word embeddings trained on code-switched tweets, specifically those that mix Spanish and English, known as Spanglish. We explore the embedding space to discover how the embeddings capture the meanings of words in both languages. We test the effectiveness of these embeddings by participating in SemEval-2020 Task 9: Sentiment Analysis on Code-Mixed Social Media Text. We use them to train a sentiment classifier that achieves an F-1 score of 0.722. This is higher than the competition baseline of 0.656, and our team ranks 14th out of 23 participating teams beating the baseline.


research
10/21/2020

LT3 at SemEval-2020 Task 9: Cross-lingual Embeddings for Sentiment Analysis of Hinglish Social Media Text

This paper describes our contribution to the SemEval-2020 Task 9 on Sent...
research
10/09/2017

Deep Learning Paradigm with Transformed Monolingual Word Embeddings for Multilingual Sentiment Analysis

The surge of social media use brings huge demand of multilingual sentime...
research
04/17/2023

New Product Development (NPD) through Social Media-based Analysis by Comparing Word2Vec and BERT Word Embeddings

This study introduces novel methods for sentiment and opinion classifica...
research
03/27/2021

Unsupervised Self-Training for Sentiment Analysis of Code-Switched Data

Sentiment analysis is an important task in understanding social media co...
research
10/26/2022

Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages

In the process of numerically modeling natural languages, developing lan...
research
10/27/2016

Word Embeddings to Enhance Twitter Gang Member Profile Identification

Gang affiliates have joined the masses who use social media to share tho...
research
12/31/2016

Expanding Subjective Lexicons for Social Media Mining with Embedding Subspaces

Recent approaches for sentiment lexicon induction have capitalized on pr...
