Fast approximation of search trees on trees with centroid trees
Search trees on trees (STTs) generalize the fundamental binary search tree (BST) data structure: in STTs the underlying search space is an arbitrary tree, whereas in BSTs it is a path. An optimal BST of size n can be computed for a given distribution of queries in O(n^2) time [Knuth 1971] and centroid BSTs provide a nearly-optimal alternative, computable in O(n) time [Mehlhorn 1977]. By contrast, optimal STTs are not known to be computable in polynomial time, and the fastest constant-approximation algorithm runs in O(n^3) time [Berendsohn, Kozma 2022]. Centroid trees can be defined for STTs analogously to BSTs, and they have been used in a wide range of algorithmic applications. In the unweighted case (i.e., for a uniform distribution of queries), a centroid tree can be computed in O(n) time [Brodal et al. 2001; Della Giustina et al. 2019]. These algorithms, however, do not readily extend to the weighted case. Moreover, no approximation guarantees were previously known for centroid trees in either the unweighted or weighted cases. In this paper we revisit centroid trees in a general, weighted setting, and we settle both the algorithmic complexity of constructing them, and the quality of their approximation. For constructing a weighted centroid tree, we give an output-sensitive O(nlog h)⊆ O(nlog n) time algorithm, where h is the height of the resulting centroid tree. If the weights are of polynomial complexity, the running time is O(nloglog n). We show these bounds to be optimal, in a general decision tree model of computation. For approximation, we prove that the cost of a centroid tree is at most twice the optimum, and this guarantee is best possible, both in the weighted and unweighted cases. We also give tight, fine-grained bounds on the approximation-ratio for bounded-degree trees and on the approximation-ratio of more general α-centroid trees.
READ FULL TEXT