Efficient Hybrid Inline and Out-of-line Deduplication for Backup Storage

05/22/2014
by   Yan Kit Li, et al.

Backup storage systems often remove redundancy across backups via inline deduplication, which works by referring duplicate chunks of the latest backup to those of existing backups. However, inline deduplication degrades restore performance of the latest backup due to fragmentation, and complicates deletion of expired backups due to the sharing of data chunks. While out-of-line deduplication addresses the problems by forward-pointing existing duplicate chunks to those of the latest backup, it introduces additional I/Os of writing and removing duplicate chunks. We design and implement RevDedup, an efficient hybrid inline and out-of-line deduplication system for backup storage. It applies coarse-grained inline deduplication to remove duplicates of the latest backup, and then fine-grained out-of-line reverse deduplication to remove duplicates from older backups. Our reverse deduplication design limits the I/O overhead and prepares for efficient deletion of expired backups. Through extensive testbed experiments using synthetic and real-world datasets, we show that RevDedup can bring high performance to the backup, restore, and deletion operations, while maintaining high storage efficiency comparable to conventional inline deduplication.
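
The two-phase idea described above can be illustrated with a small in-memory sketch. It is a toy under stated assumptions, not the authors' implementation: the class `ToyStore`, its methods, and the byte-sized `SEGMENT_SIZE` and `CHUNK_SIZE` values are all hypothetical, and a real system would work on disk with much larger units. Phase one deduplicates whole segments as a backup is written; phase two runs afterwards and replaces chunks of an older backup with references to identical chunks of the latest backup.

```python
import hashlib

SEGMENT_SIZE = 8  # hypothetical coarse-grained unit, toy-sized (bytes)
CHUNK_SIZE = 4    # hypothetical fine-grained unit, toy-sized (bytes)


def fingerprint(data):
    """Content fingerprint used to detect duplicates."""
    return hashlib.sha1(data).hexdigest()


def split(data, size):
    """Cut data into fixed-size pieces (the last piece may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ToyStore:
    def __init__(self):
        # segment fingerprint -> list of chunk entries, where an entry is
        # ("data", bytes) or ("ref", segment_fingerprint, chunk_index)
        self.segments = {}
        # each backup is a list of segment fingerprints (its recipe)
        self.backups = []

    def inline_backup(self, data):
        """Phase 1: coarse-grained inline dedup at segment granularity."""
        recipe = []
        for seg in split(data, SEGMENT_SIZE):
            fp = fingerprint(seg)
            if fp not in self.segments:  # only unseen segments are stored
                self.segments[fp] = [("data", c) for c in split(seg, CHUNK_SIZE)]
            recipe.append(fp)
        self.backups.append(recipe)
        return len(self.backups) - 1

    def reverse_dedup(self, old_id, new_id):
        """Phase 2: fine-grained out-of-line dedup; old chunks point forward."""
        new_segs = set(self.backups[new_id])
        latest = {}  # chunk fingerprint -> location in the latest backup
        for seg_fp in self.backups[new_id]:
            for idx, entry in enumerate(self.segments[seg_fp]):
                if entry[0] == "data":
                    latest.setdefault(fingerprint(entry[1]), (seg_fp, idx))
        removed = 0
        # Segments shared with the latest backup were already deduplicated inline.
        for seg_fp in set(self.backups[old_id]) - new_segs:
            entries = self.segments[seg_fp]
            for idx, entry in enumerate(entries):
                if entry[0] == "data" and fingerprint(entry[1]) in latest:
                    target = latest[fingerprint(entry[1])]
                    entries[idx] = ("ref", target[0], target[1])
                    removed += 1
        return removed

    def read_chunk(self, seg_fp, idx):
        """Follow references until a chunk that holds data is reached."""
        entry = self.segments[seg_fp][idx]
        while entry[0] == "ref":
            entry = self.segments[entry[1]][entry[2]]
        return entry[1]

    def restore(self, backup_id):
        """Rebuild a backup from its recipe."""
        return b"".join(
            self.read_chunk(fp, i)
            for fp in self.backups[backup_id]
            for i in range(len(self.segments[fp]))
        )

    def stored_bytes(self):
        """Bytes physically held, ignoring metadata."""
        return sum(len(e[1]) for seg in self.segments.values()
                   for e in seg if e[0] == "data")
```

In this sketch the latest backup always holds its own data directly, so restoring it follows no references, while older backups may need to follow them. That mirrors the trade-off the abstract describes, although the sketch omits the I/O and deletion optimizations that the paper is actually about.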
