ATRAPOS: Evaluating Metapath Query Workloads in Real Time

01/11/2022
by Serafeim Chatzopoulos, et al.

Heterogeneous information networks (HINs) represent different types of entities and relationships between them. Exploring, analysing, and extracting knowledge from such networks relies on metapath queries that identify pairs of entities connected by relationships of diverse semantics. While the real-time evaluation of metapath query workloads on large, web-scale HINs is highly demanding in computational cost, current approaches do not exploit interrelationships among the queries. In this paper, we present ATRAPOS, a new approach for the real-time evaluation of metapath query workloads that leverages a combination of efficient sparse matrix multiplication and intermediate result caching. ATRAPOS selects intermediate results to cache and reuse by detecting frequent sub-metapaths among workload queries in real time, using a tailor-made data structure, the Overlap Tree, and an associated caching policy. Our experimental study on real data shows that ATRAPOS accelerates exploratory data analysis and mining on HINs, outperforming off-the-shelf caching approaches and state-of-the-art research prototypes in all examined scenarios.
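
As a rough illustration of the idea described above, the sketch below evaluates metapath queries as chains of sparse matrix multiplications and reuses cached intermediate products when a later query shares a sub-metapath with an earlier one. It is a minimal sketch under stated assumptions, not the ATRAPOS implementation: the class `MetapathEvaluator`, the helpers `spmm` and `transpose`, the toy author/paper/venue data, and the unbounded prefix cache are all hypothetical and stand in for the Overlap Tree and its caching policy, which the abstract names but does not specify.

```python
from collections import defaultdict


def spmm(a, b):
    """Multiply two sparse matrices stored as {row: {col: value}}."""
    out = {}
    for i, row in a.items():
        acc = defaultdict(int)
        for k, v in row.items():
            for j, w in b.get(k, {}).items():
                acc[j] += v * w
        if acc:
            out[i] = dict(acc)
    return out


def transpose(m):
    """Return the transpose of a sparse matrix in the same dict format."""
    out = defaultdict(dict)
    for i, row in m.items():
        for j, v in row.items():
            out[j][i] = v
    return dict(out)


class MetapathEvaluator:
    """Hypothetical evaluator: caches every prefix product, with no eviction.

    This naive prefix cache is an assumption made for illustration; it is
    not the Overlap Tree or the caching policy from the paper.
    """

    def __init__(self, relations):
        self.relations = relations  # {(src_type, dst_type): sparse matrix}
        self.cache = {}             # tuple of entity types -> cached product
        self.multiplications = 0    # number of matrix products performed

    def evaluate(self, metapath):
        types = tuple(metapath)
        result, start = None, 2
        # Look for the longest cached prefix spanning at least two relations.
        for end in range(len(types), 2, -1):
            if types[:end] in self.cache:
                result, start = self.cache[types[:end]], end
                break
        if result is None:
            result = self.relations[(types[0], types[1])]
        # Multiply in the remaining relations, caching each new prefix.
        for i in range(start, len(types)):
            result = spmm(result, self.relations[(types[i - 1], types[i])])
            self.multiplications += 1
            self.cache[types[: i + 1]] = result
        return result


# Toy HIN (made-up data): authors (A), papers (P), venues (V).
AP = {"a1": {"p1": 1}, "a2": {"p1": 1, "p2": 1}}
PV = {"p1": {"v1": 1}, "p2": {"v1": 1}}
relations = {
    ("A", "P"): AP,
    ("P", "A"): transpose(AP),
    ("P", "V"): PV,
    ("V", "P"): transpose(PV),
}

ev = MetapathEvaluator(relations)
apv = ev.evaluate(["A", "P", "V"])              # one multiplication
after_first = ev.multiplications
apvpa = ev.evaluate(["A", "P", "V", "P", "A"])  # reuses the cached A-P-V product
after_second = ev.multiplications
```

In this toy workload the second query reuses the cached A-P-V product, so the two queries need three multiplications in total instead of four. A bounded cache would additionally have to decide which intermediate results are worth keeping, which is the part the abstract attributes to the Overlap Tree and its caching policy.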

research
12/22/2022

Reinforcement Learning Based Approaches to Adaptive Context Caching in Distributed Context Management Systems

Performance metrics-driven context caching has a profound impact on thro...
research
01/09/2020

Topical Result Caching in Web Search Engines

Caching search results is employed in information retrieval systems to e...
research
07/16/2023

Real-Time Analytics by Coordinating Reuse and Work Sharing

Analytical tools often require real-time responses for highly concurrent...
research
02/14/2020

Cleaning Denial Constraint Violations through Relaxation

Data cleaning is a time-consuming process which depends on the data anal...
research
11/03/2017

Toward real-time data query systems in HEP

Exploratory data analysis tools must respond quickly to a user's questio...
research
04/04/2023

High-Throughput Vector Similarity Search in Knowledge Graphs

There is an increasing adoption of machine learning for encoding data in...
research
01/16/2013

Probabilistic Models for Query Approximation with Large Sparse Binary Datasets

Large sparse sets of binary transaction data with millions of records an...
