Towards Sub-Word Level Compositions for Sentiment Analysis of Hindi-English Code Mixed Text

11/02/2016
by Ameya Prabhu, et al.
Sentiment analysis (SA) of code-mixed data from social media has several applications in opinion mining, ranging from customer satisfaction to social campaign analysis in multilingual societies. Advances in this area are impeded by the lack of a suitable annotated dataset. We introduce a Hindi-English (Hi-En) code-mixed dataset for sentiment analysis and perform an empirical analysis comparing the suitability and performance of various state-of-the-art SA methods on social media text. In this paper, we propose learning sub-word level representations in an LSTM architecture (Subword-LSTM) instead of character-level or word-level representations. This linguistic prior enables the architecture to learn the sentiment value of important morphemes. It also seems to work well on highly noisy text containing misspellings, as shown in our experiments and in the morpheme-level feature maps learned by our model. We hypothesize that encoding this linguistic prior in the Subword-LSTM architecture leads to its superior performance. Our system attains accuracy 4-5% greater than traditional approaches on our dataset, and also outperforms the available system for sentiment analysis in Hi-En code-mixed text by 18%.

research
10/09/2020

gundapusunil at SemEval-2020 Task 9: Syntactic Semantic LSTM Architecture for SENTIment Analysis of Code-MIXed Data

The phenomenon of mixing the vocabulary and syntax of multiple languages...
research
06/12/2018

An Ensemble Model for Sentiment Analysis of Hindi-English Code-Mixed Data

In multilingual societies like India, code-mixed social media texts comp...
research
01/30/2018

Preparation of Improved Turkish DataSet for Sentiment Analysis in Social Media

A public dataset, with a variety of properties suitable for sentiment an...
research
03/27/2020

Semantic Enrichment of Nigerian Pidgin English for Contextual Sentiment Classification

Nigerian English adaptation, Pidgin, has evolved over the years through ...
research
03/21/2018

ρ-hot Lexicon Embedding-based Two-level LSTM for Sentiment Analysis

Sentiment analysis is a key component in various text mining application...
research
06/18/2019

Curriculum Learning Strategies for Hindi-English Codemixed Sentiment Analysis

Sentiment Analysis and other semantic tasks are commonly used for social...
research
06/11/2018

Addition of Code Mixed Features to Enhance the Sentiment Prediction of Song Lyrics

Sentiment analysis, also called opinion mining, is the field of study th...
