BCSAT : A Benchmark Corpus for Sentiment Analysis in Telugu Using Word-level Annotations

07/04/2018
by Sreekavitha Parupalli, et al.

This work aims to build a systematically annotated corpus that supports sentiment analysis in Telugu through word-level sentiment annotations. From OntoSenseNet, we extracted 11,000 adjectives, 253 adverbs, and 8,483 verbs; language experts are annotating them for sentiment. We discuss the methodology followed for the polarity annotations and validate the developed resource. The goal is a benchmark corpus, developed as an extension to SentiWordNet, together with a baseline accuracy for a model in which lexeme annotations are applied to sentiment prediction. The fundamental aim of this paper is to validate and study the feasibility of using machine learning algorithms and word-level sentiment annotations for automated sentiment identification. Furthermore, accuracy is improved by annotating the bi-grams extracted from the target corpus.
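To make the idea of applying word-level and bi-gram annotations to sentiment prediction concrete, here is a minimal sketch in Python. It is not the authors' model: the abstract refers to machine learning algorithms, whereas this sketch uses a plain lookup-and-sum rule. The lexicon entries, the English placeholder tokens, and the names `score_tokens` and `predict_label` are all assumptions made for illustration; none of them come from the corpus.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicons (placeholder English tokens, NOT corpus data).
# Polarity: +1 positive, -1 negative; words not listed count as 0.
UNIGRAM_POLARITY: Dict[str, int] = {"good": 1, "happy": 1, "bad": -1}
BIGRAM_POLARITY: Dict[Tuple[str, str], int] = {("not", "good"): -1}


def score_tokens(
    tokens: List[str],
    unigrams: Dict[str, int] = UNIGRAM_POLARITY,
    bigrams: Dict[Tuple[str, str], int] = BIGRAM_POLARITY,
) -> int:
    """Sum polarities left to right, preferring an annotated bi-gram
    over the two unigrams it covers."""
    score = 0
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in bigrams:
            score += bigrams[pair]
            i += 2  # both tokens are consumed by the bi-gram entry
        else:
            score += unigrams.get(tokens[i], 0)
            i += 1
    return score


def predict_label(tokens: List[str]) -> str:
    """Map the summed score to a coarse sentiment label."""
    score = score_tokens(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(predict_label(["not", "good"]))
    print(predict_label(["good", "film"]))
```

In this toy setup the bi-gram entry changes the outcome for `["not", "good"]`, which a unigram-only lookup would score as positive. That is one plausible way bi-gram annotations could help, offered here as an illustration rather than as the paper's reported mechanism.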


