Where Did the Web Archive Go?

08/12/2021
by Mohamed Aturban, et al.
To perform a longitudinal investigation of web archives and detect variations and changes when replaying individual archived pages, or mementos, we created a sample of 16,627 mementos from 17 public web archives. Over the course of our 14-month study (November 2017 to January 2019), we found that four web archives changed their base URIs and did not leave a machine-readable method of locating their new base URIs, necessitating manual rediscovery. Of the 1,981 mementos in our sample from these four web archives, 537 were impacted: 517 mementos were rediscovered but with changes in their time of archiving (Memento-Datetime), HTTP status code, or the string comprising their original URI (URI-R), and 20 of the mementos could not be found at all.
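As a rough illustration of the comparison described above, the following Python sketch checks a rediscovered memento against its earlier observation on the three attributes the abstract names (Memento-Datetime, HTTP status code, and URI-R), and then reproduces the reported counts. The `Observation` record, the `changed_fields` helper, and the example values are hypothetical and are not taken from the study's tooling; only the counts 1,981, 517, and 20 come from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical record of one replay of a memento (not the authors' schema)."""
    memento_datetime: str  # value of the Memento-Datetime response header
    status_code: int       # HTTP status code returned when replaying the memento
    uri_r: str             # original resource URI (URI-R)


def changed_fields(before: Observation,
                   after: Optional[Observation]) -> Optional[List[str]]:
    """Return the names of attributes that differ, or None if not rediscovered."""
    if after is None:
        return None  # the memento could not be found at the new base URI
    names = ("memento_datetime", "status_code", "uri_r")
    return [n for n in names if getattr(before, n) != getattr(after, n)]


# Made-up example values, for illustration only.
before = Observation("Mon, 13 Nov 2017 10:00:00 GMT", 200, "http://example.com/")
after = Observation("Mon, 13 Nov 2017 10:00:05 GMT", 200, "http://example.com/")
example_changes = changed_fields(before, after)

# Counts reported in the abstract for the four archives that moved.
sample_from_four_archives = 1981
rediscovered_with_changes = 517
not_found = 20

impacted = rediscovered_with_changes + not_found
impacted_share = impacted / sample_from_four_archives

print(example_changes)
print(f"{impacted} impacted ({impacted_share:.1%} of {sample_from_four_archives})")
```

The two reported categories sum to the 537 impacted mementos, which is roughly 27% of the 1,981 mementos sampled from the four archives.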
