AAE: An Active Auto-Estimator for Improving Graph Storage

06/29/2022
by Yu Yan, et al.

Graphs have become an increasingly popular data model in many real-world applications, and the efficiency of graph storage is crucial for them. Tuning graph storage generally relies on database administrators (DBAs) to find the best storage configuration. However, DBAs make tuning decisions mainly on the basis of experience and intuition. Because that experience is limited, a tuning operation may have unpredictable effects and can even degrade efficiency. In this paper, we observe that a graph workload estimator has the potential to guarantee the performance of tuning operations. Unfortunately, because of the complex characteristics of the graph evaluation task, no mature estimator for graph workloads exists. We formulate the evaluation of a graph workload as a classification task and carefully design the feature engineering process, which covers graph data features, graph workload features, and graph storage features. Given the complexity of graph features and the large time cost of executing graph workloads, it is difficult to obtain a sufficiently large training set for a graph workload estimator. We therefore propose an active auto-estimator (AAE) for graph workload evaluation that combines active learning and deep learning. AAE can achieve good evaluation efficiency with a limited training set. We test the time efficiency and evaluation accuracy of AAE on two open-source graph datasets, LDBC and Freebase. Experimental results show that our estimator can complete graph workload evaluation efficiently, within milliseconds.
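To make the idea of training a classifier from few expensive labels concrete, the following is a minimal sketch of pool-based active learning with uncertainty sampling. It is not the authors' implementation. The abstract does not state which query strategy or network AAE uses, so the uncertainty-sampling rule, the tiny logistic-regression model (standing in for a deep model), the two-dimensional feature vectors, and the `run_workload` oracle are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(samples, labels, epochs=200, lr=0.5):
    """Fit a small logistic-regression classifier with plain SGD.

    Stand-in for the deep model; chosen only to keep the sketch short.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict_proba(model, x):
    """Probability that feature vector x belongs to class 1."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def run_workload(x):
    """Hypothetical oracle standing in for an expensive workload execution.

    Returns 1 if the (made-up) configuration counts as 'efficient', else 0.
    """
    return 1 if x[0] + x[1] > 1.0 else 0


def active_learning(seed, pool, budget=10):
    """Label only the pool items the current model is least sure about."""
    labeled = list(seed)
    labels = [run_workload(x) for x in labeled]
    unlabeled = [x for x in pool if x not in labeled]
    model = train(labeled, labels)
    for _ in range(budget):
        if not unlabeled:
            break
        # Uncertainty sampling: predicted probability closest to 0.5.
        idx = min(
            range(len(unlabeled)),
            key=lambda i: abs(predict_proba(model, unlabeled[i]) - 0.5),
        )
        x = unlabeled.pop(idx)
        labeled.append(x)
        labels.append(run_workload(x))  # the only expensive step
        model = train(labeled, labels)
    return model, len(labeled)


if __name__ == "__main__":
    # Hypothetical feature vectors on a regular grid in [0, 1]^2.
    pool = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
    seed = [(0.1, 0.1), (0.9, 0.9), (0.2, 0.3), (0.8, 0.7)]
    model, n_labeled = active_learning(seed, pool, budget=10)
    print("labels requested:", n_labeled, "of", len(pool))
    print("P(efficient | (0.95, 0.95)) =", predict_proba(model, (0.95, 0.95)))
```

In this sketch only the queried points pay the cost of `run_workload`; the remaining pool items are never executed, which is the property that makes active learning attractive when each label requires running a graph workload.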
