A Dataset of State-Censored Tweets

01/15/2021
by Tuğrulcan Elmas, et al.

Many governments impose traditional censorship methods on social media platforms. Rather than removing censored content completely, many social media companies, including Twitter, only withhold it from the requesting country. Such content therefore remains accessible outside the censored region, which provides an excellent setting in which to study government censorship on social media. We mine such content using the Internet Archive's Twitter Stream Grab. We release a dataset of 583,437 tweets by 155,715 users that were censored between 2012 and July 2020. We also release 4,301 accounts that were censored in their entirety. Additionally, we release a set of 22,083,759 supplemental tweets, consisting of all tweets by users with at least one censored tweet as well as retweets of those users by other users. We provide an exploratory analysis of this dataset. Beyond the study of government censorship, the dataset can also support research on hate speech detection and on the effect of censorship on social media users. The dataset is publicly available at https://doi.org/10.5281/zenodo.4439509
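As a rough illustration of how country-withheld tweets might be picked out of an archived stream, the sketch below scans JSON-lines input and keeps tweet objects with a non-empty `withheld_in_countries` field. This is an assumption about one plausible filter, not the authors' pipeline, which the abstract does not describe. The field name follows the Twitter API v1.1 tweet object; the function name `censored_tweets` and the sample records (including their country codes) are hypothetical.

```python
import json
from typing import Iterable, Iterator


def censored_tweets(lines: Iterable[str]) -> Iterator[dict]:
    """Yield tweet objects that are withheld in at least one country.

    Assumes one JSON object per line, as in a typical stream dump.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt records
        if not isinstance(obj, dict):
            continue
        # A non-empty list of country codes marks a country-withheld tweet.
        if obj.get("withheld_in_countries"):
            yield obj


# Made-up sample records for illustration only.
sample = [
    json.dumps({"id_str": "1", "text": "visible everywhere"}),
    json.dumps({"id_str": "2", "text": "withheld",
                "withheld_in_countries": ["TR", "DE"]}),
    "not json",
    json.dumps({"id_str": "3", "text": "empty list",
                "withheld_in_countries": []}),
]

hits = list(censored_tweets(sample))
print([t["id_str"] for t in hits])
```

In practice the same generator could be fed lines from decompressed archive files. Finding accounts censored in their entirety would need a separate check on the user object, which this sketch does not attempt.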

