Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora

06/09/2016
by William L. Hamilton, et al.

A word's sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words, achieving state-of-the-art performance competitive with approaches that rely on hand-curated resources. Using our framework, we perform two large-scale empirical studies to quantify the extent to which sentiment varies across time and between communities. We induce and release historical sentiment lexicons for 150 years of English and community-specific sentiment lexicons for 250 online communities from the social media forum Reddit. The historical lexicons show that more than 5% of sentiment-bearing (non-neutral) English words completely switched polarity during the last 150 years, and the community-specific lexicons highlight how sentiment varies drastically between different communities.
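The idea of combining word embeddings with label propagation from a few seed words can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a tiny hand-made embedding matrix, plain cosine similarity as the edge weight, and a random walk with restart as the propagation step. The function names (`knn_graph`, `random_walk`, `induce_polarity`) and the parameters `k` and `beta` are hypothetical choices made for this illustration.

```python
import numpy as np

def knn_graph(emb, k):
    """Symmetric k-nearest-neighbour graph weighted by cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops from the neighbour search
    graph = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        nbrs = np.argsort(sim[i])[-k:]  # indices of the k most similar words
        graph[i, nbrs] = np.clip(sim[i, nbrs], 0.0, None)  # drop negative weights
    return np.maximum(graph, graph.T)

def random_walk(graph, seed_idx, beta=0.9, iters=100):
    """Random walk with restart; the walker restarts at the seed words."""
    col_sums = graph.sum(axis=0, keepdims=True)
    trans = graph / np.where(col_sums == 0, 1.0, col_sums)  # column-stochastic
    restart = np.zeros(graph.shape[0])
    restart[seed_idx] = 1.0 / len(seed_idx)
    p = restart.copy()
    for _ in range(iters):
        p = beta * (trans @ p) + (1 - beta) * restart
    return p

def induce_polarity(emb, pos_seeds, neg_seeds, k=2, beta=0.9):
    """Score each word in [0, 1]; values above 0.5 lean positive."""
    graph = knn_graph(emb, k)
    p_pos = random_walk(graph, pos_seeds, beta)
    p_neg = random_walk(graph, neg_seeds, beta)
    return p_pos / (p_pos + p_neg + 1e-12)

# Toy 2-D "embeddings" (made up for illustration, not real data).
words = ["good", "great", "lovely", "bad", "awful", "terrible"]
emb = np.array([[1.0, 0.1], [0.9, 0.2], [1.0, 0.3],
                [-1.0, 0.1], [-0.9, 0.2], [-1.0, 0.3]])
scores = induce_polarity(emb, pos_seeds=[0], neg_seeds=[3])
for word, score in zip(words, scores):
    print(f"{word}: {score:.2f}")
```

In this toy setup the only supervision is one positive seed ("good") and one negative seed ("bad"); the remaining words receive scores purely from their position in the embedding graph. Training the embeddings on a different corpus would change the graph, which is how the same seeds could yield different lexicons for different domains.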


research
06/14/2018

SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment

Because word semantics can substantially change across communities and c...
research
01/05/2020

Generating Word and Document Embeddings for Sentiment Analysis

Sentiments of words differ from one corpus to another. Inducing general ...
research
11/16/2018

Using Sentiment Induction to Understand Variation in Gendered Online Communities

We analyze gendered communities defined in three different ways: text, u...
research
12/16/2020

Building domain specific lexicon based on TikTok comment dataset

In the sentiment analysis task, predicting the sentiment tendency of a s...
research
09/18/2023

The ParlaSent multilingual training dataset for sentiment identification in parliamentary proceedings

Sentiments inherently drive politics. How we receive and process informa...
research
05/11/2018

NRC-Canada at SMM4H Shared Task: Classifying Tweets Mentioning Adverse Drug Reactions and Medication Intake

Our team, NRC-Canada, participated in two shared tasks at the AMIA-2017 ...
research
12/02/2015

Benchmarking sentiment analysis methods for large-scale texts: A case for using continuum-scored words and word shift graphs

The emergence and global adoption of social media has rendered possible ...
