A Pipeline for Post-Crisis Twitter Data Acquisition

01/17/2018
by Mayank Kejriwal, et al.
USC Information Sciences Institute
University of Southern California

Due to the instant availability of data on social media platforms like Twitter, and advances in machine learning and data management technology, real-time crisis informatics has emerged as a prolific research area in the last decade. Although several benchmarks are now available, especially on portals like CrisisLex, an important practical problem that has not yet been addressed is the rapid acquisition and benchmarking of data from free, publicly available streams like the Twitter API. In this paper, we present ongoing work on a pipeline that facilitates immediate post-crisis data collection, curation, and relevance filtering from the Twitter API. The pipeline is minimally supervised: it alleviates the need for feature engineering by combining a judicious mix of data preprocessing and fast text embeddings with an active learning framework. We illustrate the utility of the pipeline by describing a recent case study in which it was used to collect and analyze millions of tweets in the immediate aftermath of the Las Vegas shootings.
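The abstract does not give implementation details, so the following is only a minimal sketch of the general idea (preprocess, embed, then refine a relevance classifier with active learning), not the authors' pipeline. It makes several assumptions: a hashed bag-of-words vector stands in for the learned text embeddings, a small logistic regression stands in for the relevance filter, uncertainty sampling is the query strategy, and every name here (`preprocess`, `embed`, `active_learning`, `oracle`, `DIM`) is hypothetical. The `oracle` callable plays the role of a human annotator.

```python
import math
import re
import zlib

DIM = 64  # size of the hashed feature vector (arbitrary, for illustration)

def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, and keep simple word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z0-9']+", text)

def embed(tokens):
    """Hashed bag-of-words, L2-normalised (an assumed stand-in for learned embeddings)."""
    vec = [0.0] * DIM
    for tok in tokens:
        vec[zlib.crc32(tok.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def predict(weights, vec):
    """Probability that a tweet vector is relevant (logistic model, no bias term)."""
    z = sum(w * x for w, x in zip(weights, vec))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.5):
    """Fit the logistic model with plain stochastic gradient descent."""
    weights = [0.0] * DIM
    for _ in range(epochs):
        for vec, label in examples:
            err = predict(weights, vec) - label
            weights = [w - lr * err * x for w, x in zip(weights, vec)]
    return weights

def most_uncertain(weights, vecs):
    """Index of the vector whose predicted probability is closest to 0.5."""
    return min(range(len(vecs)), key=lambda i: abs(predict(weights, vecs[i]) - 0.5))

def active_learning(seed, pool_texts, oracle, rounds=3):
    """Repeatedly ask the oracle to label the tweet the model is least sure about."""
    labeled = [(embed(preprocess(t)), y) for t, y in seed]
    pool = list(pool_texts)
    weights = train(labeled)
    for _ in range(min(rounds, len(pool))):
        vecs = [embed(preprocess(t)) for t in pool]
        text = pool.pop(most_uncertain(weights, vecs))
        labeled.append((embed(preprocess(text)), oracle(text)))
        weights = train(labeled)
    return weights, pool
```

In use, `seed` would be a handful of hand-labeled `(text, 0 or 1)` pairs, `pool_texts` the unlabeled tweets collected from the stream, and `oracle` a function that returns a human judgment for one tweet. Each round spends one label on the tweet nearest the decision boundary, which is one common way to keep supervision minimal.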

01/23/2020

The Pushshift Reddit Dataset

Social media data has become crucial to the advancement of scientific un...
03/27/2018

A Web Scraping Methodology for Bypassing Twitter API Restrictions

Retrieving information from social networks is the first and primordial ...
11/27/2020

Post or Tweet: Lessons from a Study of Facebook and Twitter Usage

This workshop paper reports on an ongoing mixed-methods study on the two...
09/22/2022

Active Keyword Selection to Track Evolving Topics on Twitter

How can we study social interactions on evolving topics at a mass scale?...
06/21/2020

Automatic Query Optimization for Retrieving Traffic Tweets

Twitter, like many social media and data brokering companies, makes thei...
03/23/2017

Rapid-Rate: A Framework for Semi-supervised Real-time Sentiment Trend Detection in Unstructured Big Data

Commercial establishments like restaurants, service centres and retailer...