The Memento Tracer Framework: Balancing Quality and Scalability for Web Archiving

09/10/2019
by Martin Klein, et al.

Web archiving frameworks are commonly assessed by the quality of their archival records and by their ability to operate at scale. The ubiquity of dynamic web content poses a significant challenge for crawler-based solutions such as the Internet Archive that are optimized for scale. Human-driven services such as the Webrecorder tool provide high-quality archival captures but are not optimized to operate at scale. We introduce the Memento Tracer framework, which aims to balance archival quality and scalability. We outline its concept and architecture and evaluate its archival quality and operation at scale. Our findings indicate that its quality is on par with or better than that of established archiving frameworks and that operation at scale comes with a manageable overhead.


