
Multilingual Twitter Sentiment Classification: The Role of Human Annotators

by Igor Mozetic, et al.
Jozef Stefan Institute

What are the limits of automated Twitter sentiment classification? We analyze a large set of manually labeled tweets in different languages, use them as training data, and construct automated classification models. It turns out that the quality of classification models depends much more on the quality and size of training data than on the type of the model trained. Experimental results indicate that there is no statistically significant difference between the performance of the top classification models. We quantify the quality of training data by applying various annotator agreement measures, and identify the weakest points of different datasets. We show that the model performance approaches the inter-annotator agreement when the size of the training set is sufficiently large. However, it is crucial to regularly monitor the self- and inter-annotator agreements since this improves the training datasets and consequently the model performance. Finally, we show that there is strong evidence that humans perceive the sentiment classes (negative, neutral, and positive) as ordered.
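The abstract does not name the agreement measures it applies. As an illustration only, the sketch below computes Cohen's kappa between two annotators over the three sentiment classes, once treating the classes as unordered and once with linear weights that respect the ordering negative < neutral < positive. The function name `cohen_kappa`, the `LABELS` list, and the toy annotations are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter

# Assumed class ordering, matching the ordering hypothesis in the abstract.
LABELS = ["negative", "neutral", "positive"]


def cohen_kappa(a, b, weighted=False):
    """Cohen's kappa between two annotators' label sequences.

    With weighted=True, a disagreement costs |i - j| (linear weights),
    so negative-vs-positive is penalised twice as much as
    negative-vs-neutral. With weighted=False every disagreement costs 1.
    """
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and of equal length")

    index = {label: i for i, label in enumerate(LABELS)}
    n = len(a)
    k = len(LABELS)

    def cost(i, j):
        return abs(i - j) if weighted else int(i != j)

    # Observed disagreement: mean cost over the paired annotations.
    observed = sum(cost(index[x], index[y]) for x, y in zip(a, b)) / n

    # Expected disagreement under independence, from each annotator's
    # marginal label counts.
    count_a = Counter(index[x] for x in a)
    count_b = Counter(index[y] for y in b)
    expected = sum(
        cost(i, j) * count_a[i] * count_b[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)

    if expected == 0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return 1.0 - observed / expected


# Toy data (made up for illustration): the annotators disagree once,
# and that disagreement spans the two extreme classes.
annotator_1 = ["negative", "neutral", "positive", "positive"]
annotator_2 = ["negative", "neutral", "positive", "negative"]

print(cohen_kappa(annotator_1, annotator_2))
print(cohen_kappa(annotator_1, annotator_2, weighted=True))
```

In this toy case the weighted score is lower than the unweighted one, because the single disagreement is between the two extreme classes. If annotators do perceive the classes as ordered, an order-aware measure of this kind is the more informative of the two.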

