Generating Word and Document Embeddings for Sentiment Analysis

01/05/2020
by Cem Rıfkı Aydın, et al.

Sentiments of words differ from one corpus to another. Inducing general sentiment lexicons for languages and using them cannot, in general, produce meaningful results for different domains. In this paper, we combine contextual and supervised information with the general semantic representations of words occurring in the dictionary. Contexts of words help us capture domain-specific information, and supervised scores of words are indicative of the polarities of those words. When we combine supervised features of words with the features extracted from their dictionary definitions, we observe an increase in the success rates. We try out combinations of the contextual, supervised, and dictionary-based approaches, and generate original vectors. We also combine the word2vec approach with hand-crafted features. We induce domain-specific sentiment vectors for two Turkish corpora, one from the movie domain and one from Twitter. When we then generate document vectors and train support vector machines on those vectors, our approaches outperform the baseline studies for Turkish by a significant margin. We also evaluate our models on two English corpora, where they again outperform the word2vec approach. This suggests that our approaches carry over across languages and domains.
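
The idea of pairing embeddings with supervised word scores can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the smoothed log-ratio used as the supervised polarity score, the choice of minimum, maximum, and mean polarity as hand-crafted document features, the function names, and the toy two-dimensional embeddings standing in for word2vec vectors.

```python
import math
from collections import Counter


def polarity_scores(docs, labels, smoothing=1.0):
    """Hypothetical supervised score per word: log ratio of its smoothed
    relative frequency in positive versus negative documents."""
    pos, neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (pos if label == 1 else neg).update(tokens)
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        w: math.log(((pos[w] + smoothing) / pos_total)
                    / ((neg[w] + smoothing) / neg_total))
        for w in vocab
    }


def document_vector(tokens, embeddings, scores, dim):
    """Average the word embeddings, then append the minimum, maximum,
    and mean polarity score of the document's words."""
    known = [t for t in tokens if t in embeddings]
    if known:
        avg = [sum(embeddings[t][i] for t in known) / len(known)
               for i in range(dim)]
    else:
        avg = [0.0] * dim
    # Words without a supervised score are treated as neutral (0.0).
    pol = [scores.get(t, 0.0) for t in tokens] or [0.0]
    return avg + [min(pol), max(pol), sum(pol) / len(pol)]


# Toy labelled corpus (1 = positive, 0 = negative); illustrative only.
docs = [["great", "film"], ["awful", "film"],
        ["great", "acting"], ["awful", "plot"]]
labels = [1, 0, 1, 0]

# Toy stand-ins for pretrained word vectors.
embeddings = {"great": [0.9, 0.1], "film": [0.5, 0.5], "awful": [0.1, 0.9]}

scores = polarity_scores(docs, labels)
vec = document_vector(["great", "film"], embeddings, scores, dim=2)
```

Document vectors built this way could then be passed to any standard classifier, such as a support vector machine, which is the classifier the abstract names.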

research
06/09/2016

Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora

A word's sentiment depends on the domain in which it is used. Computatio...
research
11/26/2016

Structural Correspondence Learning for Cross-lingual Sentiment Classification with One-to-many Mappings

Structural correspondence learning (SCL) is an effective method for cros...
research
06/01/2020

Hybrid Improved Document-level Embedding (HIDE)

In recent times, word embeddings are taking a significant role in sentim...
research
07/06/2019

Best Practices for Learning Domain-Specific Cross-Lingual Embeddings

Cross-lingual embeddings aim to represent words in multiple languages in...
research
04/01/2017

Sentiment Analysis of Citations Using Word2vec

Citation sentiment analysis is an important task in scientific paper ana...
research
06/14/2018

SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment

Because word semantics can substantially change across communities and c...
research
12/11/2015

Words are not Equal: Graded Weighting Model for building Composite Document Vectors

Despite the success of distributional semantics, composing phrases from ...
