ForkBase: Immutable, Tamper-evident Storage Substrate for Branchable Applications

04/16/2020
by Qian Lin, et al.

Data collaboration activities typically require systematic or protocol-based coordination to be scalable. Git, an effective enabler of collaborative coding, has proven successful in countless projects around the world. Applying the Git philosophy to general data collaboration beyond coding is therefore well motivated. We call this Git for data. However, the original Git design handles data at file granularity, which is too coarse-grained for many database applications. We argue that Git for data should be co-designed with database systems. To this end, we developed ForkBase to make Git for data practical. ForkBase is a distributed, immutable storage system designed for data version management and collaborative data operations. In this demonstration, we show how ForkBase can greatly facilitate collaborative data management and how its novel data deduplication technique can improve storage efficiency for archiving massive data versions.
