Code-Mixed Sentiment Analysis Using Machine Learning and Neural Network Approaches

08/09/2018
by Pruthwik Mishra, et al.

The Sentiment Analysis for Indian Languages (SAIL) Code-Mixed tools contest aimed at identifying the sentence-level sentiment polarity of code-mixed datasets of Indian language pairs (Hi-En, Ben-Hi-En). The Hi-En dataset is henceforth referred to as HI-EN and the Ben-Hi-En dataset as BN-EN. We submitted four models for sentiment analysis of the code-mixed HI-EN and BN-EN datasets. The first model was an ensemble voting classifier consisting of three classifiers (linear SVM, logistic regression and random forests), while the second was a linear SVM. Both models used TF-IDF feature vectors of character n-grams, with n ranging from 2 to 6. We used the scikit-learn (sklearn) machine learning library to implement both approaches. Run1 was obtained from the voting classifier and Run2 from the linear SVM model. Across the four submitted outputs, Run2 outperformed Run1 on both datasets. We finished first in the contest on both HI-EN, with an F-score of 0.569, and BN-EN, with an F-score of 0.526.
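The two systems described above can be sketched roughly as follows with scikit-learn. This is only a minimal illustration, not the authors' code. The abstract specifies character n-grams with n from 2 to 6, TF-IDF features, and the three classifier types. Everything else is an assumption: the `analyzer="char"` setting, hard voting, the default hyperparameters, the helper names `build_run1` and `build_run2`, and the made-up toy sentences.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def char_tfidf():
    # Character n-grams with n from 2 to 6, as stated in the abstract.
    # analyzer="char" (rather than "char_wb") is an assumption.
    return TfidfVectorizer(analyzer="char", ngram_range=(2, 6))


def build_run1():
    # Hypothetical helper: voting ensemble of linear SVM, logistic
    # regression and random forest. Hard (majority) voting is assumed,
    # since LinearSVC does not expose class probabilities.
    voter = VotingClassifier(
        estimators=[
            ("svm", LinearSVC()),
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ],
        voting="hard",
    )
    return make_pipeline(char_tfidf(), voter)


def build_run2():
    # Hypothetical helper: a single linear SVM on the same features.
    return make_pipeline(char_tfidf(), LinearSVC())


# Made-up toy sentences for illustration only; not from the contest data.
texts = [
    "yeh movie bahut achhi thi, loved it",
    "kya bakwas service hai, totally disappointed",
    "aaj office jaana hai at 9 am",
    "bahut badhiya song hai, awesome",
    "worst experience, paisa barbaad",
    "kal meeting hai in the evening",
]
labels = ["positive", "negative", "neutral",
          "positive", "negative", "neutral"]

run1 = build_run1().fit(texts, labels)
run2 = build_run2().fit(texts, labels)

new_texts = ["movie achhi thi", "service bakwas hai"]
run1_predictions = run1.predict(new_texts)
run2_predictions = run2.predict(new_texts)
```

Character n-grams are a common choice for code-mixed text because they tolerate the inconsistent romanized spellings that word-level features would treat as unrelated tokens.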


