An A*-algorithm for the Unordered Tree Edit Distance with Custom Costs

07/26/2021
by Benjamin Paaßen, et al.

The unordered tree edit distance is a natural metric for computing distances between trees without an intrinsic child order, such as representations of chemical molecules. While computing the unordered tree edit distance is MAX SNP-hard in general, it is feasible for small trees, e.g. via an A* algorithm. Unfortunately, current heuristics for the A* algorithm assume unit costs for deletions, insertions, and replacements, which limits our ability to inject domain knowledge. In this paper, we present three novel heuristics for the A* algorithm that work with custom cost functions. In experiments on two chemical data sets, we show that custom costs make the A* computation faster and reduce the error of a 5-nearest-neighbor regressor that predicts chemical properties. We also show that, on these data, polynomial-time edit distances can achieve results similar to those of the unordered tree edit distance.
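
To make the evaluation setup concrete, the following is a minimal sketch of a 5-nearest-neighbor regressor that works on precomputed distances, such as tree edit distances between molecules. It is not the authors' implementation: the function name `knn_regress`, the unweighted mean over neighbors, and the toy numbers are all assumptions chosen for illustration.

```python
def knn_regress(dist_row, train_targets, k=5):
    """Predict a target as the unweighted mean over the k nearest training items.

    dist_row[i] is the precomputed distance (e.g. a tree edit distance)
    between the query item and training item i. Hypothetical helper.
    """
    if len(dist_row) != len(train_targets):
        raise ValueError("need one distance per training target")
    # Sort training indices by their distance to the query item.
    order = sorted(range(len(dist_row)), key=lambda i: dist_row[i])
    nearest = order[:k]
    return sum(train_targets[i] for i in nearest) / len(nearest)


# Toy data (made up): distances from one query tree to seven training trees,
# and the property value attached to each training tree.
distances = [0.5, 2.0, 1.0, 3.0, 0.1, 4.0, 0.7]
targets = [1, 2, 3, 4, 5, 6, 7]
prediction = knn_regress(distances, targets, k=5)
```

Because the regressor only consumes distances, any edit distance (unordered, or a polynomial-time variant) with any cost function can be swapped in without changing this code.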

research
05/17/2018

Revisiting the tree edit distance and its backtracing: A tutorial

Almost 30 years ago, Zhang and Shasha published a seminal paper describi...
research
04/17/2023

Subcubic algorithm for (Unweighted) Unrooted Tree Edit Distance

The tree edit distance problem is a natural generalization of the classi...
research
02/08/2023

Weighted Edit Distance Computation: Strings, Trees and Dyck

Given two strings of length n over alphabet Σ, and an upper bound k on t...
research
09/15/2022

Õ(n+poly(k))-time Algorithm for Bounded Tree Edit Distance

Computing the edit distance of two strings is one of the most basic prob...
research
06/13/2018

Tree Edit Distance Learning via Adaptive Symbol Embeddings

Metric learning has the aim to improve classification accuracy by learni...
research
05/18/2018

Tree Edit Distance Learning via Adaptive Symbol Embeddings: Supplementary Materials and Results

Metric learning has the aim to improve classification accuracy by learni...
research
07/26/2022

Tree edit distance for hierarchical data compatible with HMIL paradigm

We define edit distance for hierarchically structured data compatible wi...