Replaying Archived Twitter: When your bird is broken, will it bring you down?
Historians and researchers trust web archives to preserve social media content that no longer exists on the live web. However, what we see on the live web and how it is replayed in the archive are not always the same. In this paper, we document and analyze the problems in archiving Twitter ever since Twitter forced the use of its new UI in June 2020. Most web archives were unable to archive the new UI, resulting in archived Twitter pages displaying Twitter's "Something went wrong" error. The challenges in archiving the new UI forced web archives to continue using the old UI. To analyze the potential loss of information in web archival data due to this change, we used the personal Twitter account of the 45th President of the United States, @realDonaldTrump, which was suspended by Twitter on January 8, 2021. Trump's account was heavily labeled by Twitter for spreading misinformation, however we discovered that there is no evidence in web archives to prove that some of his tweets ever had a label assigned to them. We also studied the possibility of temporal violations in archived versions of the new UI, which may result in the replay of pages that never existed on the live web. Our goal is to educate researchers who may use web archives and caution them when drawing conclusions based on archived Twitter pages.
READ FULL TEXT