The Impact of Data Persistence Bias on Social Media Studies

03/02/2023
by Tuğrulcan Elmas, et al.

Social media studies often collect data retrospectively to analyze public opinion. Social media data may decay over time, and such decay may prevent the collection of the complete dataset. As a result, the collected dataset may differ from the complete dataset, and the study may suffer from data persistence bias. Past research suggests that datasets collected retrospectively are largely representative of the original dataset in terms of textual content. However, no study has analyzed the impact of data persistence bias on social media studies such as those focusing on controversial topics. In this study, we analyze data persistence and the bias it introduces in three types of datasets: controversial topics, trending topics, and framing of issues. We report which topics among these datasets are more likely to suffer from data persistence. We quantify the data persistence bias using the change in political orientation, the presence of potentially harmful content, and topics as measures. We found that controversial datasets are more likely to suffer from data persistence and that they lean towards the political left upon recollection. The turnout of data that contain potentially harmful content is significantly lower on non-controversial datasets. Overall, we found that the topics promoted by right-aligned users are more likely to suffer from data persistence. Account suspensions are the primary factor contributing to data removals, if not the only one. Our results emphasize the importance of accounting for data persistence bias by collecting the data in real time when the dataset employed is vulnerable to it.


