
Selection on X_1+X_2+... + X_m with layer-ordered heaps

by Patrick Kreitzberg, et al.

Selection on X_1+X_2+... + X_m is an important problem with many applications in areas such as max-convolution, max-product Bayesian inference, calculating most probable isotopes, and computing non-parametric test statistics, among others. Faster-than-naïve approaches exist for m=2: Johnson & Mizoguchi (1978) find the smallest k values in A+B with runtime O(n log(n)). Frederickson & Johnson (1982) created a method for finding the k smallest values in A+B with runtime O(n + min(k,n) log(k/min(k,n))). In 1993, Frederickson published an optimal algorithm for selection on A+B, which runs in O(n+k). In 2018, Kaplan et al. described another optimal algorithm in terms of Chazelle's soft heaps. No fast methods exist for m>2. Johnson & Mizoguchi (1978) introduced a method to compute the minimal k terms when m>2, but that method runs in O(m·n^(m/2) log(n)) and is inefficient when m ≫ 1. In this paper, we introduce the first efficient methods for problems where m>2. We introduce the “layer-ordered heap,” a simple special class of heap with which we produce a new, fast selection algorithm on the Cartesian product. Using this new algorithm to perform k-selection on the Cartesian product of m arrays of length n has runtime ∈ o(m·n + k·m). We also provide implementations of the proposed algorithms and report their performance in practice.
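
To make the m=2 problem statement concrete, the following is a minimal sketch of selecting the k smallest values of A+B. It is a simple baseline built on sorting and a binary heap, not the layer-ordered-heap algorithm of the paper nor any of the optimal O(n+k) methods cited above; the function name `k_smallest_pairwise_sums` is hypothetical and chosen only for illustration.

```python
import heapq


def k_smallest_pairwise_sums(a, b, k):
    """Return the k smallest values of a[i] + b[j] in ascending order.

    Baseline sketch only: sorts both inputs, then walks a frontier of
    candidate index pairs with a binary heap, costing roughly
    O(n log n + k log k). This is NOT the paper's algorithm.
    """
    a, b = sorted(a), sorted(b)
    if not a or not b or k <= 0:
        return []

    result = []
    # Each heap entry is (sum, i, j) for the candidate a[i] + b[j].
    heap = [(a[0] + b[0], 0, 0)]
    while heap and len(result) < k:
        total, i, j = heapq.heappop(heap)
        result.append(total)
        # Advance along row i; every pair (i, j) with j > 0 is reached
        # only from (i, j - 1), so no pair is pushed twice.
        if j + 1 < len(b):
            heapq.heappush(heap, (a[i] + b[j + 1], i, j + 1))
        # Open the next row only from column 0.
        if j == 0 and i + 1 < len(a):
            heapq.heappush(heap, (a[i + 1] + b[0], i + 1, 0))
    return result


if __name__ == "__main__":
    print(k_smallest_pairwise_sums([7, 1, 4], [2, 9, 3], 4))
```

Because this sketch returns the values in sorted order, it does more work than selection strictly requires. The optimal methods mentioned in the abstract avoid that by not fully ordering the output.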



