Kashmir: A Computational Analysis of the Voice of Peace

by Shriphani Palakodety, et al.

The recent Pulwama terror attack (February 14, 2019, Pulwama, Kashmir) triggered a chain of escalating events between India and Pakistan, adding another episode to their 70-year-old dispute over Kashmir. The present era of ubiquitous social media has never seen nuclear powers closer to war. In this paper, we analyze this evolving international crisis via a substantial corpus constructed from comments on YouTube videos (921,235 English comments posted by 392,460 users, out of 2.04 million overall comments by 791,289 users on 2,890 videos). Our main contributions are threefold. First, we observe that polyglot word embeddings reveal precise and accurate language clusters, and we use this observation to construct a document language-identification technique with negligible annotation requirements. We demonstrate its viability and utility across a variety of data sets involving several low-resource languages. Second, we present an extensive analysis of temporal trends in pro-peace and pro-war intent through a manually constructed polarity phrase lexicon. We observe that when tensions between the two nations were at their peak, pro-peace intent in the corpus was at its highest point. Finally, in the context of heated discussions in a politically tense situation where two nations are on the brink of a full-fledged war, we argue for the importance of automatically identifying user-generated web content that can defuse hostility, and we address this prediction task, dubbed hope-speech detection.
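The lexicon-based temporal analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the phrase lists, the function names `polarity` and `daily_trend`, and the rule that a comment matching both lists is left unlabeled are all assumptions made for illustration, and the paper's manually constructed lexicon is not reproduced here.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical phrase lists, for illustration only.
# The paper's actual lexicon is manually constructed and not shown here.
PRO_PEACE = ("say no to war", "we want peace")
PRO_WAR = ("we want war", "teach them a lesson")


def polarity(comment):
    """Return "peace", "war", or None using case-insensitive phrase matching."""
    text = comment.lower()
    peace = any(phrase in text for phrase in PRO_PEACE)
    war = any(phrase in text for phrase in PRO_WAR)
    if peace and not war:
        return "peace"
    if war and not peace:
        return "war"
    return None  # no match, or both lists matched (assumed: leave unlabeled)


def daily_trend(comments):
    """Map each day to the fraction of that day's comments per polarity.

    `comments` is an iterable of (date, text) pairs.
    """
    counts = defaultdict(Counter)
    for day, text in comments:
        counts[day]["total"] += 1
        label = polarity(text)
        if label is not None:
            counts[day][label] += 1
    return {
        day: {
            "peace": c["peace"] / c["total"],
            "war": c["war"] / c["total"],
        }
        for day, c in sorted(counts.items())
    }


if __name__ == "__main__":
    # Made-up toy comments, not drawn from the paper's corpus.
    sample = [
        (date(2019, 2, 27), "We want peace, not this"),
        (date(2019, 2, 27), "nice video"),
        (date(2019, 2, 14), "We want war now"),
    ]
    for day, shares in daily_trend(sample).items():
        print(day, shares)
```

Normalizing by the number of comments per day, rather than reporting raw counts, keeps days with heavy comment volume from dominating the trend; whether the paper normalizes in this way is not stated in the abstract.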


