An Efficient Probabilistic Approach for Graph Similarity Search

06/17/2017
by Zijian Li, et al.

Graph similarity search is a common and fundamental operation in graph databases. One of the most popular graph similarity measures is the Graph Edit Distance (GED), mainly because of its broad applicability and high interpretability. Despite its prevalence, exact GED computation has been proved NP-hard, which can make it impractically slow on large graphs. However, exact search results are usually unnecessary in real-world applications, especially when responsiveness matters far more than accuracy. In this paper, we therefore propose a novel probabilistic approach to efficiently estimate GED, which is then leveraged for graph similarity search. Specifically, we first take branches as the elementary structures of graphs and introduce a novel graph similarity measure that compares the branches of two graphs, the Graph Branch Distance (GBD), which can be calculated in polynomial time. We then formulate the relationship between GED and GBD by treating branch variations as the result of graph edit operations, and model this process probabilistically. With this model, the GED between any two graphs can be efficiently estimated from their GBD, and these estimates are used in the graph similarity search. Extensive experiments on real and synthetic data sets show that our approach achieves better accuracy, efficiency, and scalability than comparable methods for graph similarity search.
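To make the idea of comparing branches concrete, here is a minimal Python sketch. The abstract does not spell out the definitions, so the sketch rests on assumptions: a branch is taken to be a vertex label together with the sorted multiset of labels on its incident edges, and GBD is taken to be the larger vertex count minus the size of the multiset intersection of the two graphs' branches. The function names `branches` and `graph_branch_distance` and the `(vertex_labels, edges)` graph representation are hypothetical and chosen only for illustration.

```python
from collections import Counter

def branches(vertex_labels, edges):
    """Return the multiset of branches of a graph.

    Assumption: a branch is (vertex label, sorted tuple of incident edge labels).
    vertex_labels: dict mapping vertex id -> label
    edges: list of (u, v, edge_label) tuples for an undirected graph
    """
    incident = {v: [] for v in vertex_labels}
    for u, v, label in edges:
        incident[u].append(label)
        incident[v].append(label)
    return Counter(
        (vertex_labels[v], tuple(sorted(incident[v]))) for v in vertex_labels
    )

def graph_branch_distance(g1, g2):
    """Assumed GBD: max(|V1|, |V2|) minus the number of shared branches."""
    b1 = branches(*g1)
    b2 = branches(*g2)
    # Counter & Counter keeps the minimum count per key (multiset intersection).
    shared = sum((b1 & b2).values())
    return max(len(g1[0]), len(g2[0])) - shared

# Two small path graphs A - B - C that differ in one edge label.
g1 = ({0: "A", 1: "B", 2: "C"}, [(0, 1, "x"), (1, 2, "y")])
g2 = ({0: "A", 1: "B", 2: "C"}, [(0, 1, "x"), (1, 2, "x")])

print(graph_branch_distance(g1, g1))
print(graph_branch_distance(g1, g2))
```

Under these assumptions, comparing `g1` with itself gives 0, while relabeling one edge changes the branches of both of its endpoints, so `g1` and `g2` are at distance 2. Building the branch multisets takes time linear in the graph size, which matches the abstract's point that GBD is cheap to compute compared with exact GED.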


research
10/23/2018

Convolutional Set Matching for Graph Similarity

We introduce GSimCNN (Graph Similarity Computation via Convolutional Neu...
research
01/30/2022

Similarity Search on Computational Notebooks

Computational notebook software such as Jupyter Notebook is popular for ...
research
11/30/2020

Combinatorial Learning of Graph Edit Distance via Dynamic Embedding

Graph Edit Distance (GED) is a popular similarity measurement for pairwi...
research
03/31/2021

Efficient Exploration of Interesting Aggregates in RDF Graphs

As large Open Data are increasingly shared as RDF graphs today, there is...
research
10/31/2022

kt-Safety: Graph Release via k-Anonymity and t-Closeness (Technical Report)

In a wide spectrum of real-world applications, it is very important to a...
research
11/15/2021

EmbAssi: Embedding Assignment Costs for Similarity Search in Large Graph Databases

The graph edit distance is an intuitive measure to quantify the dissimil...
research
11/27/2019

Towards Similarity Graphs Constructed by Deep Reinforcement Learning

Similarity graphs are an active research direction for the nearest neigh...
