CTDGM: A Data Grouping Model Based on Cache Transaction for Unstructured Data Storage Systems

09/30/2020
by Dongjie Zhu, et al.

Cache prefetching has become the mainstream data access optimization strategy in data centers. However, the rapid growth of unstructured data generates massive numbers of pairwise access relationships, which impose a heavy computational burden on existing prefetching models and severely degrade data access performance. We propose a cache-transaction-based data grouping model (CTDGM) that addresses these problems by optimizing the feature representation method and the grouping efficiency. First, we define the cache transaction and propose a method for extracting the cache transaction feature (CTF). Second, we design a data chunking algorithm based on CTF and spatiotemporal locality to make relationship calculation more efficient. Third, we propose CTDGM, which constructs a relation graph that partitions data into independent groups according to the strength of the data access relations. In experiments against state-of-the-art methods, our algorithm achieves an average increase in the cache hit rate of 12% with a small cache size (0.001% [...] number of data I/O accesses by 50% [...] all the data.
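The abstract does not spell out how cache transactions, CTF, or the relation graph are computed, so the following is only a rough Python sketch of the general idea of grouping data by access-relation strength. Everything in it is an assumption made for illustration: a "transaction" is taken to be a run of accesses separated by at most `max_gap` time units, relation strength is a plain co-occurrence count, and groups are the connected components of the graph whose edges meet a hypothetical `min_weight` threshold. It does not reproduce the paper's CTF extraction or chunking algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations


def split_transactions(trace, max_gap):
    """Split a time-ordered (timestamp, item) trace into transactions.

    Assumption (not from the paper): a gap larger than max_gap
    between consecutive accesses starts a new transaction.
    """
    transactions, current, last_ts = [], [], None
    for ts, item in trace:
        if last_ts is not None and ts - last_ts > max_gap:
            transactions.append(current)
            current = []
        current.append(item)
        last_ts = ts
    if current:
        transactions.append(current)
    return transactions


def relation_weights(transactions):
    """Count how often each unordered pair of items shares a transaction."""
    weights = Counter()
    for txn in transactions:
        for a, b in combinations(sorted(set(txn)), 2):
            weights[(a, b)] += 1
    return weights


def group_items(transactions, min_weight):
    """Group items as connected components over edges with weight >= min_weight."""
    weights = relation_weights(transactions)
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Register every item so weakly related items form singleton groups.
    for txn in transactions:
        for item in txn:
            find(item)

    # Merge the endpoints of every sufficiently strong relation.
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for item in list(parent):
        groups[find(item)].add(item)
    return sorted(sorted(group) for group in groups.values())
```

Under these assumptions, items that repeatedly appear in the same transaction end up in one group, which a prefetcher could then load together on a miss; items whose pairwise counts stay below the threshold remain in separate groups.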

