Kaskade: Graph Views for Efficient Graph Analytics

06/12/2019
by Joana M. F. da Trindade, et al.

Graphs are an increasingly popular way to model real-world entities and the relationships between them, ranging from social networks to data lineage graphs and biological datasets. Queries over these large graphs often involve expensive subgraph traversals and complex analytical computations. Real-world graphs are often substantially more structured than a generic vertex-and-edge model would suggest, but existing graph engines have left this insight mostly unexplored for query optimization. In this work, we therefore focus on leveraging structural properties of graphs and queries to automatically derive materialized graph views that can dramatically speed up query evaluation. We present KASKADE, the first graph query optimization framework to exploit materialized graph views. KASKADE employs a novel constraint-based view enumeration technique that mines constraints from query workloads and graph schemas and injects them during view enumeration to significantly reduce the search space of views to be considered. Moreover, it introduces a graph view size estimator, used both to pick the most beneficial views to materialize given a query set and to select the best query evaluation plan given a set of materialized views. We evaluate its performance over real-world graphs, including the provenance graph that we maintain at Microsoft to enable auditing, service analytics, and advanced system optimizations. Our results show that KASKADE substantially reduces the effective graph size and yields significant performance speedups (up to 50X), in some cases making otherwise intractable queries possible.
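
To make the idea of a materialized graph view concrete, the following is a minimal sketch, not KASKADE's implementation. It assumes a toy provenance-style graph with hypothetical vertex types `job` and `file`, and a hypothetical helper `two_hop_view` that precomputes one edge for every job-to-file-to-job path. The abstract confirms none of these names or the schema; they are illustrative assumptions only.

```python
from collections import defaultdict


def two_hop_view(edges, vertex_type, src_type, mid_type, dst_type):
    """Materialize a view holding one edge per src -> mid -> dst path.

    Hypothetical helper for illustration; the type names are assumptions.
    """
    # Build an adjacency list of outgoing edges.
    out = defaultdict(set)
    for u, v in edges:
        out[u].add(v)

    view = set()
    for u, mids in out.items():
        if vertex_type[u] != src_type:
            continue
        for m in mids:
            if vertex_type[m] != mid_type:
                continue
            # .get avoids inserting new keys while iterating over `out`.
            for w in out.get(m, ()):
                if vertex_type[w] == dst_type:
                    view.add((u, w))
    return view


# Toy graph (assumed schema): jobs write files, files are read by jobs.
vertex_type = {"j1": "job", "j2": "job", "j3": "job",
               "f1": "file", "f2": "file"}
edges = [("j1", "f1"), ("f1", "j2"), ("j2", "f2"), ("f2", "j3")]

job_to_job = two_hop_view(edges, vertex_type, "job", "file", "job")
```

On this toy input the view has two edges where the base graph has four, so a query that only asks which jobs depend on which other jobs can traverse the smaller view instead of the full graph. This is one simple instance of how exploiting schema structure can reduce the effective graph size.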

research
05/19/2021

Automatic View Selection in Graph Databases

Recently, several works have studied the problem of view selection in gr...
research
04/11/2020

Graphsurge: Graph Analytics on View Collections Using Differential Computation

This paper presents the design and implementation of a new open-source v...
research
03/31/2021

Efficient Exploration of Interesting Aggregates in RDF Graphs

As large Open Data are increasingly shared as RDF graphs today, there is...
research
04/03/2020

Recursive SPARQL for Graph Analytics

Work on knowledge graphs and graph-based data management often focus eit...
research
01/23/2020

Leveraging Neighborhood Summaries for Efficient RDF Queries on RDBMS

Using structural informations to summarize graph-structured RDF data is ...
research
03/23/2021

HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries (Extended Version)

Hybrid complex analytics workloads typically include (i) data management...
research
01/21/2018

Learning to Speed Up Query Planning in Graph Databases

Querying graph structured data is a fundamental operation that enables i...
