# Selection on X_1+X_2+... + X_m with layer-ordered heaps

Selection on X_1+X_2+... + X_m is an important problem with many applications in areas such as max-convolution, max-product Bayesian inference, calculating most probable isotopes, and computing non-parametric test statistics, among others. Faster-than-naïve approaches exist for m=2: Johnson & Mizoguchi (1978) find the smallest k values in A+B with runtime O(n log(n)). Frederickson & Johnson (1982) created a method for finding the k smallest values in A+B with runtime O(n + min(k,n)log(k/min(k,n))). In 1993, Frederickson published an optimal algorithm for selection on A+B, which runs in O(n+k). In 2018, Kaplan et al. described another optimal algorithm in terms Chazelle's of soft heaps. No fast methods exist for m>2. Johnson & Mizoguchi (1978) introduced a method to compute the minimal k terms when m>2, but that method runs in O(m· n^m/2log(n)) and is inefficient when m ≫ 1. In this paper, we introduce the first efficient methods for problems where m>2. We introduce the “layer-ordered heap,” a simple special class of heap with which we produce a new, fast selection algorithm on the Cartesian product. Using this new algorithm to perform k-selection on the Cartesian product of m arrays of length n has runtime ∈ o(m· n + k· m). We also provide implementations of the algorithms proposed and their performance in practice.

READ FULL TEXT