A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection

05/30/2018
by Sahil Swami, et al.

Social media platforms like Twitter and Facebook have become two of the largest mediums people use to express their views on different topics. The generation of such large volumes of user data has made NLP tasks like sentiment analysis and opinion mining much more important. Sarcasm has lately become a popular trend in text on social media. Sarcasm reverses the meaning and polarity of what the text implies, which poses a challenge for many NLP tasks. The task of sarcasm detection in text is gaining more and more importance for both commercial and security services. We present the first English-Hindi code-mixed dataset of tweets marked for the presence of sarcasm and irony, in which each token is also annotated with a language tag. We also present a baseline supervised classification system developed using the same dataset, which achieves an average F-score of 78.4 using a random forest classifier with 10-fold cross-validation.
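
The evaluation protocol named in the abstract (an F-score averaged over 10 cross-validation folds) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the contiguous fold split, the placeholder data, and the majority-class stand-in classifier are all assumptions made for illustration. A random forest such as scikit-learn's `RandomForestClassifier`, trained on whatever features are extracted from the tweets, would take the place of the stand-in.

```python
from typing import Callable, List, Sequence, Tuple


def f1_score(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> float:
    """F-score of the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds; each fold is held out once."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    splits = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        splits.append((train, held_out))
    return splits


def cross_validate(
    texts: Sequence[str],
    labels: Sequence[int],
    train_and_predict: Callable[[List[str], List[int], List[str]], List[int]],
    k: int = 10,
) -> float:
    """Average the per-fold F-score over k train/test splits."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(texts), k):
        pred = train_and_predict(
            [texts[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [texts[i] for i in test_idx],
        )
        scores.append(f1_score([labels[i] for i in test_idx], pred))
    return sum(scores) / len(scores)


def majority_baseline(
    train_texts: List[str], train_labels: List[int], test_texts: List[str]
) -> List[int]:
    """Hypothetical stand-in classifier: always predict the most frequent training label."""
    majority = max(set(train_labels), key=train_labels.count)
    return [majority] * len(test_texts)


# Placeholder data only (not tweets from the corpus): 1 = sarcastic, 0 = not sarcastic.
texts = [f"placeholder tweet {i}" for i in range(30)]
labels = [0 if i % 3 == 0 else 1 for i in range(30)]

average_f = cross_validate(texts, labels, majority_baseline, k=10)
print(f"average F-score over 10 folds: {average_f:.3f}")
```

The classifier is passed in as a callable so that the fold logic and the metric stay independent of the model. A shuffled or stratified split would usually be preferable to contiguous folds on real data, where the class distribution may vary along the file.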
