Chunk Content is not Enough: Chunk-Context Aware Resemblance Detection for Deduplication Delta Compression

06/02/2021
by   Xuming Ye, et al.

With the growing popularity of cloud storage, removing duplicate data across users is becoming more critical for service providers seeking to reduce costs. Data resemblance detection is a recent technique for finding redundancy among similar chunks. It extracts features from the content of each chunk and treats highly similar chunks as candidates for redundancy removal. However, popular resemblance detection methods such as "N-transform" and "Finesse" use only the chunk data for feature extraction, so a minor modification to a data chunk can seriously degrade their detection capability. In this paper, we propose a novel chunk-context aware resemblance detection algorithm, called CARD, to mitigate this issue. CARD introduces a BP neural network-based chunk-context aware model and uses an N-sub-chunk shingles-based initial feature extraction strategy. By integrating the internal structure of each data chunk's content with its context information for feature extraction, it significantly reduces the impact of small changes in data chunks. To evaluate its performance, we implement a CARD prototype and conduct extensive experiments using real-world data sets. The results show that CARD can detect up to 75.03% more redundant data and accelerate resemblance detection operations by 5.6 to 17.8 times compared with state-of-the-art resemblance detection approaches.
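To make the idea of sub-chunk shingle features more concrete, the following Python sketch splits a chunk into N sub-chunks, hashes every sliding window (shingle) in each sub-chunk, and keeps the largest hash per sub-chunk as a feature. It is only an illustration under stated assumptions, not CARD's actual algorithm: the function names, the BLAKE2b hash, the max-hash selection, the shingle width, and the position-wise resemblance score are all hypothetical choices, and the BP neural network context model is not covered.

```python
import hashlib
from typing import Iterator, List


def shingle_hashes(data: bytes, width: int = 8) -> Iterator[int]:
    """Yield a 64-bit hash for every `width`-byte sliding window (shingle)."""
    # Assumption: BLAKE2b stands in for whatever rolling hash a real system uses.
    for i in range(max(len(data) - width + 1, 1)):
        digest = hashlib.blake2b(data[i:i + width], digest_size=8).digest()
        yield int.from_bytes(digest, "big")


def sub_chunk_features(chunk: bytes, n: int = 4, width: int = 8) -> List[int]:
    """Split `chunk` into n sub-chunks and keep the max shingle hash of each."""
    size = -(-len(chunk) // n)  # ceiling division so all bytes are covered
    return [
        max(shingle_hashes(chunk[k * size:(k + 1) * size], width))
        for k in range(n)
    ]


def resemblance(a: List[int], b: List[int]) -> float:
    """Fraction of positions at which two feature vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical data: a 1 KiB chunk and a copy with one byte overwritten.
base = bytes(range(256)) * 4
modified = bytearray(base)
modified[10] ^= 0xFF
modified = bytes(modified)

features_base = sub_chunk_features(base)
features_modified = sub_chunk_features(modified)
score = resemblance(features_base, features_modified)
```

Because the overwritten byte falls inside the first sub-chunk only, at most one of the four features can change, so the two chunks still share at least three features. An inserted or deleted byte would shift the sub-chunk boundaries, which this simple sketch does not handle.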


