
Replaying Archived Twitter: When your bird is broken, will it bring you down?

by Kritika Garg, et al.
Old Dominion University
Internet Archive

Historians and researchers trust web archives to preserve social media content that no longer exists on the live web. However, what we see on the live web and how it is replayed in the archive are not always the same. In this paper, we document and analyze the problems that have arisen in archiving Twitter since Twitter forced the use of its new UI in June 2020. Most web archives were unable to archive the new UI, resulting in archived Twitter pages displaying Twitter's "Something went wrong" error. The challenges in archiving the new UI forced web archives to continue using the old UI. To analyze the potential loss of information in web archival data due to this change, we used the personal Twitter account of the 45th President of the United States, @realDonaldTrump, which was suspended by Twitter on January 8, 2021. Trump's account was heavily labeled by Twitter for spreading misinformation; however, we discovered that there is no evidence in web archives to prove that some of his tweets ever had a label assigned to them. We also studied the possibility of temporal violations in archived versions of the new UI, which may result in the replay of pages that never existed on the live web. Our goal is to educate researchers who may use web archives and to caution them when drawing conclusions based on archived Twitter pages.



