Unsupervised Hashtag Retrieval and Visualization for Crisis Informatics

01/18/2018
by   Yao Gu, et al.

On social media platforms such as Twitter, hashtags carry substantial semantic information and are easily distinguished from the main text. Exploring and visualizing the space of hashtags in a meaningful way can offer important insights into a dataset, especially in crisis situations. In this demonstration paper, we present a functioning prototype, HashViz, that ingests a corpus of tweets collected in the aftermath of a crisis (such as the Las Vegas shooting) and uses the fastText bag-of-tricks semantic embedding algorithm (from Facebook Research) to embed words and hashtags into a vector space. The resulting hashtag vectors can then be visualized in 2D using the t-SNE dimensionality reduction algorithm. Although multiple Twitter visualization platforms exist, HashViz is distinguished by being simple, scalable, interactive and portable enough to be deployed on a server for million-tweet corpora collected in the aftermath of arbitrary disasters, without special-purpose installation, technical expertise, manual supervision or costly software or infrastructure investment. We show that HashViz, although simple, offers an intuitive way to summarize, and gain insight into, a developing crisis situation. HashViz is also completely unsupervised, requiring no manual input to go from a raw corpus to a visualization and search interface. Using the recent Las Vegas mass shooting as a case study, we illustrate the potential of HashViz using only a web browser on the client side.
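As a rough illustration of the retrieval idea described in the abstract, the sketch below pulls hashtags out of tweet text and ranks hashtags by cosine similarity in a vector space. It is not the HashViz implementation: the regular expression, the function names, the choice of cosine similarity, and the three-dimensional toy vectors are all assumptions made for this example. In a real pipeline the vectors would come from a trained fastText model and would be projected to 2D with t-SNE for display.

```python
import math
import re

# Assumed pattern: '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#\w+")


def extract_hashtags(tweet):
    """Return the lowercased hashtags of a tweet, in order of appearance."""
    return [tag.lower() for tag in HASHTAG_RE.findall(tweet)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_hashtags(query, vectors, k=3):
    """Rank every hashtag except the query by similarity to the query."""
    q = vectors[query]
    scored = [(tag, cosine(q, vec)) for tag, vec in vectors.items() if tag != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors invented for illustration; they are NOT output from any model.
TOY_VECTORS = {
    "#lasvegas": [0.9, 0.1, 0.0],
    "#vegasstrong": [0.8, 0.2, 0.1],
    "#prayforvegas": [0.7, 0.3, 0.0],
    "#guncontrol": [0.1, 0.9, 0.2],
}

if __name__ == "__main__":
    print(extract_hashtags("Thoughts with #LasVegas tonight #PrayForVegas"))
    for tag, score in nearest_hashtags("#lasvegas", TOY_VECTORS, k=2):
        print(tag, round(score, 3))
```

Because the ranking only needs a dictionary of vectors, the same search function would work unchanged on embeddings of any dimension; only the toy dictionary would be replaced by vectors learned from the tweet corpus.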


