 # Selection from heaps, row-sorted matrices and X+Y using soft heaps

We use soft heaps to obtain simpler optimal algorithms for selecting the k-th smallest item, and the set of k smallest items, from a heap-ordered tree, from a collection of sorted lists, and from X+Y, where X and Y are two unsorted sets. Our results match, and in some ways extend and improve, classical results of Frederickson (1993) and Frederickson and Johnson (1982). In particular, for selecting the k-th smallest item, or the set of k smallest items, from a collection of m sorted lists we obtain a new optimal "output-sensitive" algorithm that performs only O(m + ∑_{i=1}^m log(k_i+1)) comparisons, where k_i is the number of items of the i-th list that belong to the overall set of k smallest items.


## 1 Introduction

The input to the standard selection problem is a set of n items, drawn from a totally ordered domain, and an integer 1 ≤ k ≤ n. The goal is to return the k-th smallest item in the set. A classical result of Blum et al. says that the selection problem can be solved deterministically in O(n) time, i.e., faster than sorting the set. The number of comparisons required for selection was reduced by Schönhage et al. to 3n + o(n), and by Dor and Zwick [8, 9] to about 2.95n.

In the generalized selection problem, we are also given a partial order P known to hold for the set of n input items. The goal is again to return the k-th smallest item. The corresponding generalized sorting problem was extensively studied. It was shown by Kahn and Saks that the problem can be solved using only O(log e(P)) comparisons, where e(P) is the number of linear extensions of P. Thus, the information-theoretic lower bound is tight for generalized sorting. The algorithm of Kahn and Saks performs only O(log e(P)) comparisons, but may spend much more time on deciding which comparisons to perform. Kahn and Kim and Cardinal et al. gave algorithms that perform only O(log e(P)) comparisons and run in polynomial time.

A moment’s reflection shows that an algorithm that finds the k-th smallest item of a set must also identify the set of k smallest items of the set.¹ Given a partial order P, let s(P,k) be the number of subsets of size k that may possibly be the set of k smallest items in P. Then, log₂ s(P,k) is clearly a lower bound on the number of comparisons required to select the k-th smallest item, or the set of k smallest items. Unlike sorting, this information-theoretic lower bound for selection may be extremely weak. For example, the information-theoretic lower bound for selecting the minimum is only log₂ n, while n−1 comparisons are clearly needed (and are sufficient). To date, there is no characterization of pairs (P,k) for which the information-theoretic lower bound for selection is tight, nor an alternative general technique to obtain a tight lower bound.

¹ The information gathered by a comparison-based algorithm corresponds to a partial order which can be represented by a DAG (Directed Acyclic Graph). Every topological sort of the DAG corresponds to a total order of the items consistent with the partial order. Suppose that e is claimed to be the k-th smallest item and suppose, for the sake of contradiction, that the set I of items that are incomparable with e is non-empty. Then, there is a topological sort in which e is before all the items of I, and another topological sort in which e is after all the items of I, contradicting the fact that e is the k-th smallest item in all total orders consistent with the partial order.

Frederickson and Johnson [12, 13, 14] and Frederickson studied the generalized selection problem for some interesting specific partial orders. Frederickson considered the case in which the input items are items of a binary min-heap, i.e., they are arranged in a binary tree, with each item smaller than its two children. Frederickson gave a complicated algorithm that finds the k-th smallest item using only O(k) comparisons, matching the information-theoretic lower bound for this case. (Each subtree of size k of the heap, containing the root, can correspond to the set of k smallest items, and there are C_k = (2k choose k)/(k+1), the k-th Catalan number, such subtrees.)

Frederickson and Johnson [12, 13, 14] considered three other interesting special cases. (i) The input items are in a collection of m sorted lists, or equivalently they reside in a row-sorted matrix; (ii) The input items reside in a collection of matrices, where each matrix is both row- and column-sorted; (iii) The input items are X+Y, where X and Y are unsorted sets of items.² For each of these cases, they present a selection algorithm that matches the information-theoretic lower bound.

² By X+Y we mean the set of pairwise sums {x+y : x ∈ X, y ∈ Y}.

We note in passing that sorting X+Y is a well studied problem. Fredman showed that X+Y, where |X| = |Y| = n, can be sorted using only O(n²) comparisons, but it is not known how to do it in O(n²) time. (An intriguing situation!) Fredman also showed that Ω(n²) comparisons are required if only comparisons between items in X+Y, i.e., comparisons of the form x₁+y₁ ≤ x₂+y₂, are allowed. Lambert and Steiger and Streinu gave algorithms that sort X+Y in O(n² log n) time using only O(n²) comparisons. Kane et al., in a breakthrough result, have shown recently that X+Y can be sorted using only O(n log² n) comparisons of a more general form, but it is again not known how to implement their algorithm efficiently.

The median of X+Y, on the other hand, can be found in O(n log n) time, using O(n log n) comparisons of items in X+Y, as was already shown by Johnson and Mizoguchi and Johnson and Kashdan. The selection problem from X+Y becomes more challenging when k is much smaller than n².

Frederickson gives two applications for selection from a binary min-heap. The first is in an algorithm for listing the k smallest spanning trees of an input graph. The second is a certain resource allocation problem. Eppstein uses the heap selection algorithm in his algorithm for generating the k shortest paths between a pair of vertices in a digraph. As pointed out by Frederickson and Johnson, selection from X+Y can be used to compute the Hodges-Lehmann estimator in statistics. Selection from a matrix with sorted rows solves the problem of "optimum discrete distribution of effort" with concave functions, i.e., the problem of maximizing ∑_{i=1}^m f_i(k_i) subject to ∑_{i=1}^m k_i = k, where the f_i's are concave and the k_i's are non-negative integers. (See Koopman, and the other references given by Frederickson and Johnson.) Selection from a matrix with sorted rows is also used by Brodal et al. and Bremner et al.

The heap selection algorithm of Frederickson is fairly complicated. The naïve algorithm for the problem runs in O(k log k) time. Frederickson first improves this to O(k log log k), then to O(k·3^{log* k}), then to O(k·log* k), and finally to O(k).

Our first result is a very simple heap selection algorithm obtained by running the naïve algorithm using an auxiliary soft heap instead of a standard heap. Soft heaps, discussed below, are fairly simple data structures whose implementation is not much more complicated than the implementation of standard heaps. Our overall algorithm is thus simple and easy to implement and comprehend.

Relying on our simple heap selection algorithm, we obtain simplified algorithms for selection from row-sorted matrices and from X+Y. Selecting the k-th smallest item from a row-sorted matrix with m rows using our algorithms requires O(m+k) time, if k ≤ 2m, and O(m log(k/m)) time, if k ≥ 2m, matching the optimal results of Frederickson and Johnson. Furthermore, we obtain a new optimal "output-sensitive" algorithm whose running time is O(m + ∑_{i=1}^m log(k_i+1)), where k_i is the number of items of the i-th row that belong to the set of k smallest items in the matrix. We also use our simple heap selection algorithm to obtain simple optimal algorithms for selection from X+Y.

Soft heaps are "approximate" priority queues introduced by Chazelle. They form a major building block in his deterministic O(m α(m,n))-time algorithm for finding minimum spanning trees, which is currently the fastest known deterministic algorithm for the problem. Chazelle also shows that soft heaps can be used to obtain a simple linear time (standard) selection algorithm. (See the next section.) Pettie and Ramachandran use soft heaps to obtain an optimal deterministic minimum spanning tree algorithm with a yet unknown running time. A simplified implementation of soft heaps is given in Kaplan et al.

All algorithms considered in this paper are comparison-based, i.e., the only operations they perform on the input items are pairwise comparisons. In the selection from X+Y problem, the algorithms make pairwise comparisons of items in X, of items in Y, and of items in X+Y. The number of comparisons performed by the algorithms presented in this paper dominates the total running time of the algorithms.

The rest of the paper is organized as follows. In Section 2 we review the definition of soft heaps. In Section 3 we describe our heap selection algorithms. In Section 3.1 we describe our basic algorithm for selection from binary min-heaps. In Sections 3.2 and 3.3 we extend the algorithm to d-ary heaps and then to general heap-ordered trees and forests. In Section 4 we describe our selection algorithms for row-sorted matrices. In Section 4.1 we describe a simple O(m+k) algorithm which is optimal if k = O(m). In Section 4.2 we build on it to obtain an O(m log(k/m)) algorithm, which is optimal for k ≥ 2m. In Sections 4.3 and 4.4 we obtain new results that were not obtained by Frederickson and Johnson. In Section 4.3 we obtain an O(∑_{i=1}^m log(n_i+1)) algorithm, where n_i is the length of the i-th row of the matrix. In Section 4.4 we obtain the new optimal output-sensitive O(m + ∑_{i=1}^m log(k_i+1)) algorithm. In Section 5 we present our selection algorithms for X+Y: in Section 5.1 a simple algorithm for small values of k, and in Section 5.2 a simple algorithm for larger values of k. We conclude in Section 6 with some remarks and open problems.

## 2 Soft heaps

Soft heaps, invented by Chazelle, support the following operations:

make-heap(ε): Create and return a new, empty soft heap with error parameter ε.

insert(Q, e): Insert item e into soft heap Q.

meld(Q₁, Q₂): Return a soft heap containing all items in heaps Q₁ and Q₂, destroying Q₁ and Q₂.

extract-min(Q): Delete from the soft heap Q, and return, an item of minimum (current) key in Q.

In Chazelle's original interface, extract-min operations are broken into find-min and delete operations. We only need combined extract-min operations. We also do not need meld operations in this paper.

The main difference between soft heaps and regular heaps is that soft heaps are allowed to increase the keys of some of the items in the heap by an arbitrary amount. Such items are said to become corrupt. The soft heap implementation chooses which items to corrupt, and by how much to increase their keys. The only constraint is that for a certain error parameter ε, the number of corrupt items in the heap is at most εI, where I is the number of insertions performed so far. The ability to corrupt items allows the implementation of soft heaps to use what Chazelle calls the "data structures equivalent of car pooling" to reduce the amortized time per operation to O(log(1/ε)), which is the optimal possible dependency on ε. In the implementation of Kaplan et al., extract-min operations take O(log(1/ε)) amortized time, while all other operations take O(1) amortized time. (In the implementation of Chazelle, insert operations take O(log(1/ε)) amortized time while the other operations take O(1) amortized time.)

An extract-min operation returns an item whose current, possibly corrupt, key is the smallest in the heap. Ties are broken arbitrarily. (Soft heaps usually give many items the same corrupt key, even if initially all keys are distinct.) Each item e thus has two keys associated with it: its original key key(e), and its current key key_Q(e) in the soft heap, where key(e) ≤ key_Q(e). If key(e) < key_Q(e), then e is corrupt. The current key of an item may increase several times.

At first sight, it may seem that the guarantees provided by soft heaps are extremely weak. The only bound available is on the number of corrupt items currently in the heap. In particular, all items extracted from the heap may be corrupt. Nonetheless, soft heaps prove to be an extremely useful data structure. In particular, they play a key role in the fastest known deterministic algorithm of Chazelle  for finding minimum spanning trees.

Soft heaps can also be used, as shown by Chazelle, to select an approximate median of n items. Initialize a soft heap with some error parameter ε < 1/2. Insert the n items into the soft heap and then perform n/2 extract-min operations. Find the maximum item e, with respect to the original keys, among the extracted items. The rank of e is between n/2 and (1/2+ε)n. The rank is at least n/2 as e is the largest among n/2 extracted items. The rank is at most (1/2+ε)n as the items remaining in the soft heap that are smaller than e must be corrupt, so there are at most εn such items. For, say, ε = 1/4, the running time of the algorithm is O(n).

Using a linear time approximate median algorithm, we can easily obtain a linear time algorithm for selecting the k-th smallest item. We first compute the true rank r of the approximate median e. If r = k, we are done. If r > k, we throw away all items larger than e. Otherwise, we throw away e and all items smaller than it, and replace k by k−r. We then continue recursively. In O(n) time, we reduced n to at most (1/2+ε)n ≤ 3n/4, so the total running time is O(n).
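This rank-reduction recursion can be sketched in a few lines. The sketch below is illustrative only: the soft-heap approximate-median routine is stood in for by an exact binary heap (the names `approx_median` and `select` are ours, not the paper's), so the rank guarantee holds trivially; only a real soft heap makes `approx_median` run in O(n) time.

```python
import heapq

def approx_median(items):
    # Stand-in for the soft-heap routine sketched above: insert all n items,
    # perform n//2 extract-min operations, and return the largest extracted
    # item. With an exact heap (error parameter eps = 0) the returned item
    # has rank exactly n//2, well inside the guaranteed range [n/2, (1/2+eps)n].
    h = list(items)
    heapq.heapify(h)
    extracted = [heapq.heappop(h) for _ in range(len(items) // 2)]
    return max(extracted)

def select(items, k):
    # Return the k-th smallest item (1-indexed, distinct items assumed) by
    # repeatedly discarding a constant fraction of the items around an
    # approximate median, as described in the text.
    items = list(items)
    while len(items) > 1:
        e = approx_median(items)
        r = sum(1 for x in items if x <= e)       # true rank of e
        if r == k:
            return e
        if r > k:
            items = [x for x in items if x <= e]  # keep the r smallest items
        else:
            items = [x for x in items if x > e]   # drop the r smallest items
            k -= r
    return items[0]
```

With a genuine soft heap each round costs O(n) and shrinks the input geometrically, giving the linear total bound claimed above.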

In this paper, we show the usefulness of soft heaps in solving generalized selection problems. We obtain simpler algorithms than those known before, and some results that were not known before.

In Chazelle and Kaplan et al., soft heaps may corrupt items while performing any type of operation. It is easy, however, to slightly change the implementation of Kaplan et al. such that corruptions only occur following extract-min operations. In particular, insert operations do not cause corruption, and an extract-min operation returns an item with a smallest current key at the beginning of the operation. These assumptions simplify algorithms that use soft heaps, and further simplify their analysis. The changes needed in the implementation of soft heaps to meet these assumptions are minimal. The operations insert (and meld) are simply implemented in a lazy way. The implementation of Kaplan et al. already has the property that extract-min operations cause corruptions only after extracting an item with minimum current key.

We assume that an extract-min operation returns a pair (e, C), where e is the extracted item, and C is a list of items that became corrupt after the extraction of e, i.e., items that were not corrupt before the operation, but are corrupt after it. We also assume that e.corrupt is a bit that says whether e is corrupt. (Note that e.corrupt is simply a shorthand for key(e) < key_Q(e).) It is again easy to change the implementation of Kaplan et al. so that extract-min operations return a list C of newly corrupt items, without affecting the amortized running times of the various operations. (In particular, the amortized running time of an extract-min operation is still O(log(1/ε)), independent of the length of C. As each item becomes corrupt only once, it is easy to charge the cost of adding an item to C to its insertion into the heap.)

We stress that the assumptions we make on soft heaps in this paper can be met by minor and straightforward modifications of the implementation of Kaplan et al., as sketched above. No complexities are hidden here. We further believe that due to their usefulness, these assumptions will become the standard assumptions regarding soft heaps.

## 3 Selection from heap-ordered trees

In Section 3.1 we present our simple, soft heap-based, algorithm for selecting the k-th smallest item, and the set of k smallest items, from a binary min-heap. This algorithm is the cornerstone of this paper. For simplicity, we assume throughout this section that the input heap is infinite. In particular, each item e in the input heap has two children e.left and e.right. (A non-existent child is represented by a dummy item with key +∞.) In Section 3.2 we adapt the algorithm to work for d-ary heaps, for any d ≥ 2, using "on-the-fly ternarization via heapification". In Section 3.3 we extend the algorithm to work on any heap-ordered tree or forest. The results of Section 3.3 are new.

### 3.1 Selection from binary heaps

The naïve algorithm for selection from a binary min-heap is given in Figure 1. The root of the input heap is inserted into an auxiliary heap (priority queue). The minimal item e is extracted from the auxiliary heap and appended to a list L. The two children of e, if they exist, are inserted into the auxiliary heap. This operation is repeated k times. After k iterations, the items in L are the k smallest items in the input heap, in sorted order. Overall, at most 2k+1 items are inserted into the auxiliary heap and k items are extracted from it, so the total running time is O(k log k), which is optimal if the k smallest items are to be reported in sorted order.
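In Python, the naïve algorithm can be written directly with the standard-library `heapq` module as the auxiliary heap (a sketch; the input is an array-based binary min-heap, and `naive_heap_select` is our name for it):

```python
import heapq

def naive_heap_select(heap, k):
    # `heap` is an array-based binary min-heap: the children of the item at
    # index i sit at indices 2i+1 and 2i+2. Returns the k smallest items in
    # sorted order. k extractions from, and O(k) insertions into, an
    # auxiliary heap of size O(k) give the O(k log k) running time.
    aux = [(heap[0], 0)]                  # auxiliary heap, seeded with the root
    out = []
    for _ in range(k):
        key, i = heapq.heappop(aux)       # minimum over all current candidates
        out.append(key)
        for c in (2 * i + 1, 2 * i + 2):  # insert the children, if they exist
            if c < len(heap):
                heapq.heappush(aux, (heap[c], c))
    return out
```

Entries are (key, index) pairs so that equal keys are tie-broken by position rather than raising comparison errors.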

Frederickson devised a very complicated algorithm that outputs the k smallest items, not necessarily in sorted order, in only O(k) time, matching the information-theoretic lower bound. In Figure 2 we give our very simple algorithm Soft-Select for the same task, which also runs in optimal O(k) time and performs only O(k) comparisons. Our algorithm is a simple modification of the naïve algorithm of Figure 1, with the auxiliary heap replaced by a soft heap. The resulting algorithm is much simpler than the algorithm of Frederickson.

Algorithm Soft-Select begins by initializing a soft heap Q with error parameter ε = 1/4 and by inserting the root r of the input heap into it. Items inserted into the soft heap Q are also inserted into a list S. The algorithm then performs k iterations. In each iteration, the operation extract-min(Q) extracts an item e with the smallest (possibly corrupt) key currently in Q, and also returns the list C of items that became corrupt as a result of the removal of e from Q. If e is not corrupt, then it is added to C. Now, for each item e' in C, we insert its two children e'.left and e'.right into the soft heap Q and the list S.

Lemma 3.1 below claims that Soft-Select inserts the k smallest items of the input heap into the soft heap Q. Lemma 3.2 claims that, overall, only O(k) items are inserted into Q, and hence into S. Thus, the k smallest items in the input heap can be found by selecting the k smallest items in the list S using a standard linear time selection algorithm.
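The structure of the algorithm can be sketched as follows. A full soft-heap implementation is beyond the scope of a sketch, so a degenerate stand-in with ε = 0 (an exact heap that never corrupts anything) plays the role of the soft heap; the class and function names are ours. With a real soft heap, e.g. the implementation of Kaplan et al., the same driver runs in O(k) time; with the stand-in it degenerates to the naïve O(k log k) algorithm.

```python
import heapq

class ZeroErrorSoftHeap:
    # Degenerate stand-in for a soft heap with error parameter eps = 0:
    # it never corrupts items, so every extract-min returns an exact minimum
    # and an empty list of newly corrupt items.
    def __init__(self):
        self._h = []

    def insert(self, key, node):
        heapq.heappush(self._h, (key, node))

    def extract_min(self):
        key, node = heapq.heappop(self._h)
        return (key, node), False, []     # (item, corrupt bit, newly corrupt)

def soft_select(heap, k):
    # `heap` is an array-based binary min-heap (children of i at 2i+1, 2i+2).
    # S collects every item ever inserted into the soft heap; by Lemma 3.2
    # |S| = O(k), so one standard selection over S yields the k smallest.
    Q = ZeroErrorSoftHeap()
    Q.insert(heap[0], 0)
    S = [heap[0]]
    for _ in range(k):
        (key, node), corrupt, newly_corrupt = Q.extract_min()
        C = newly_corrupt if corrupt else newly_corrupt + [(key, node)]
        for _, i in C:                        # insert both children of e and
            for c in (2 * i + 1, 2 * i + 2):  # of every newly corrupt item
                if c < len(heap):
                    Q.insert(heap[c], c)
                    S.append(heap[c])
    return sorted(S)[:k]      # stand-in for a linear-time selection over S
```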

###### Lemma 3.1

Algorithm Soft-Select inserts the k smallest items of the input binary min-heap into the soft heap Q. (Some of them may subsequently be extracted from the soft heap.)

Figure 3: Types of items in the input heap. White nodes belong to A, i.e., were not inserted yet into the soft heap Q; black nodes belong to the barrier B; gray nodes belong to C, i.e., are corrupt; striped nodes belong to D, i.e., were already deleted.

Proof:  At the beginning of an iteration of algorithm Soft-Select, let A be the set of items of the input binary heap that were not yet inserted into the soft heap Q; let B be the set of items that were inserted, not yet removed, and are not corrupt; let C be the set of items that were inserted, not yet removed, and are corrupt; let D be the set of items that were inserted and already deleted from Q. We prove below, by easy induction, the following two invariants:

• (a) All strict ancestors of items in B are in C ∪ D.

• (b) Each item in A has an ancestor in B.

Thus, the items in B form a barrier that separates the items of A, i.e., items that were not inserted yet into the soft heap, from the items of C ∪ D, i.e., items that were inserted and are either corrupt or were already removed from the soft heap Q. For an example, see Figure 3.

Invariants (a) and (b) clearly hold at the beginning of the first iteration, when B = {r}, A contains all other items, and C and D are empty. Assume that (a) and (b) hold at the beginning of some iteration. Each iteration removes an item from the soft heap. The item removed is either a corrupt item from C, or an item (in fact the smallest item) of the barrier B. Following the extraction, some items of the barrier B may become corrupt and move to C. The barrier is 'mended' by inserting into Q the children of the items of B that were extracted or became corrupt. By our simplifying assumption, insertions do not corrupt items, so the newly inserted items belong to B and are thus part of the new barrier, reestablishing (a) and (b).

We now make the following two additional claims:

• (c) The item extracted at each iteration is smaller than or equal to the smallest item on the barrier. (With respect to the original keys.)

• (d) The smallest item on the barrier cannot decrease.

Claim (c) follows immediately from the definition of an extract-min operation and our assumption that corruption occurs only after an extraction. All the items on the barrier, and in particular the smallest item b on the barrier, are in the soft heap and are not corrupt. Thus, the extracted item e is either b itself, or an item whose (possibly corrupt) current key is at most key(b). As corruption can only increase keys, we have key(e) ≤ key_Q(e) ≤ key_Q(b) = key(b).

Claim (d) clearly holds as items on the barrier at the end of an iteration were either on the barrier at the beginning of the iteration, or are children of items that were on the barrier at the beginning of the iteration and, by heap order, children are not smaller than their parents.

Consider now the smallest item b on the barrier after k iterations. As all extracted items are smaller than or equal to it, the rank of b is at least k. Furthermore, all items smaller than b must be in B ∪ C ∪ D, i.e., were inserted at some stage into the soft heap. Indeed, let a be an item of A, i.e., an item not inserted into Q. By invariant (b), a has an ancestor b' on the barrier. By heap order and the assumption that b is the smallest item on the barrier, we indeed get a ≥ b' ≥ b. Thus, the k smallest items were indeed inserted into the soft heap, as claimed.

The proof of Lemma 3.1 relies on our assumption that corruptions in the soft heap occur only after extract-min operations. A slight change in the algorithm is needed if insert operations may cause corruptions; we need to repeatedly add children of newly corrupt items until no new items become corrupt. (Lemma 3.2 below shows that this process must end if ε < 1/2. The process may not end if ε ≥ 1/2.) The algorithm, without any change, remains correct, and in particular Lemma 3.1 holds, if extract-min operations are allowed to corrupt items before extracting an item of minimum (corrupt) key. The proof, however, becomes more complicated. (Claim (c), for example, does not hold in that case.)

###### Lemma 3.2

Algorithm Soft-Select inserts only O(k) items into the soft heap Q.

Proof:  Let I be the number of insertions made by Soft-Select, and let C be the number of items that become corrupt during the running of the algorithm. (Note that Soft-Select clearly terminates.) Let ε be the error parameter of the soft heap. We have I ≤ 1 + 2k + 2C, as each inserted item is either the root r, or a child of an item extracted during one of the k iterations of the algorithm, and there are at most 2k such insertions, or a child of a corrupt item, and there are at most 2C such insertions. We also have C ≤ εI + k, as by the definition of soft heaps, at the end of the process at most εI items in the soft heap may be corrupt, and as only k (possibly corrupt) items were removed from the soft heap. Combining these two inequalities we get C ≤ ε(1 + 2k + 2C) + k, and hence C(1 − 2ε) ≤ (1 + 2ε)k + ε. Thus, if ε < 1/2, we get

 C < ((1+2ε)/(1−2ε))·(k+1),  I < 2·(1 + (1+2ε)/(1−2ε))·(k+1).

The number of insertions I is therefore O(k), as claimed. (For ε = 1/4, we get I < 8(k+1).)

Combining the two lemmas we easily get:

###### Theorem 3.3

Algorithm Soft-Select selects the k smallest items of a binary min-heap in O(k) time.

Proof:  The correctness of the algorithm follows from Lemmas 3.1 and 3.2. Lemma 3.2 also implies that only O(k) operations are performed on the soft heap. As ε = 1/4 is a constant, each soft heap operation takes O(1) amortized time. The total running time, and the number of comparisons, performed by the main loop of Soft-Select is thus O(k). As the size of S is O(k), the selection of the k smallest items from S also takes only O(k) time.

### 3.2 Selection from d-ary heaps

Frederickson claims, in the last sentence of his paper, that his algorithm for binary min-heaps can be modified to yield an optimal algorithm for d-ary min-heaps, for any d ≥ 2, but no details are given. (In a d-ary heap, each node has (at most) d children.)

We present two simple algorithms for selecting the k smallest items from a d-ary min-heap. The first is a simple modification of the algorithm for the binary case. The second is a simple reduction from the d-ary case to the ternary case.

Algorithm Soft-Select of Figure 2 can be easily adapted to work on d-ary heaps. We simply insert the d children of an extracted item, or of an item that becomes corrupt, into the soft heap. If we again let I be the number of items inserted into the soft heap, and C be the number of items that become corrupt, we get I ≤ 1 + dk + dC and C ≤ εI + k, and hence

 C < ((1+dε)/(1−dε))·(k+1),  I < d·(1 + (1+dε)/(1−dε))·(k+1),

provided that ε < 1/d, e.g., ε = 1/(2d). The algorithm then performs O(dk) insert operations, each with an amortized cost of O(1), and k extract-min operations, each with an amortized cost of O(log(1/ε)) = O(log d). The total running time is therefore O(dk). (Note that it is important here to use the soft heap implementation of Kaplan et al., with an O(1) amortized cost per insert.)

An alternative algorithm for d-ary heaps, for any d ≥ 2, can be obtained by a simple reduction from d-ary heaps to ternary (or binary) heaps using a process that we call "on-the-fly ternarization via heapification". We use the well-known fact that an array of d items can be heapified, i.e., converted into a binary heap, in O(d) time. (See Williams or Cormen et al.) We describe this alternative approach because we think it is interesting, and because we use it in the next section to obtain an algorithm for general heap-ordered trees, i.e., trees in which different nodes may have different degrees, and the degrees of the nodes are not necessarily bounded by a constant.

In a d-ary heap, each item e has (up to) d children e₁, …, e_d. We construct a ternary heap on the same set of items in the following way. We heapify the d children of e, i.e., construct a binary heap whose items are these children. This gives each child e_i of e two new children e_i.left and e_i.right. (Some of these new children are null.) We let e.middle be the root of the heap composed of the children of e. Overall, this gives each item e in the original heap three new children e.left, e.middle and e.right, some of which may be null. Note that e gets its new children e.left and e.right when it and its siblings are heapified. (The names left, middle and right are, of course, arbitrary.) For an example, see Figure 4.

Figure 4: On-the-fly ternarization of a 7-ary heap. Thin lines represent the original 7-ary heap. Bold arrows represent new left and right children. Dashed arrows represent new middle children.

This heapification process can be carried out on-the-fly while running Soft-Select on the resulting ternary heap. The algorithm starts by inserting the root of the d-ary heap, which is also the root of its ternarized version, into the soft heap. When an item e is extracted from the soft heap, or becomes corrupt, we do not immediately insert its d original children into the soft heap. Instead, we heapify its d children, in O(d) time. This assigns e its middle child e.middle. Item e already has its left and right children e.left and e.right defined. The three new children e.left, e.middle and e.right are now inserted into the soft heap. We call the resulting algorithm Soft-Select-Ternarized.
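The heapification step at the heart of the ternarization is tiny; the sketch below (our `ternarize_children`, not code from the paper) shows how the d children of an extracted or corrupt item are rearranged:

```python
import heapq

def ternarize_children(children):
    # Heapify the d children of a node in O(d) time. In the returned
    # array-based binary heap h, h[0] is the parent's new *middle* child,
    # and h[2j+1] and h[2j+2] are the new left and right children of h[j].
    h = list(children)
    heapq.heapify(h)
    return h
```

For example, ternarizing the seven children `[9, 4, 7, 6, 8, 5, 3]` yields a binary heap whose root, the smallest child, becomes the middle child; each node then feeds at most three children (left, middle, right) into the soft heap, independent of d.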

###### Theorem 3.4

Algorithm Soft-Select-Ternarized selects the k smallest items from a d-ary heap with root r in O(dk) time.

Proof:  Algorithm Soft-Select-Ternarized essentially works on a ternary version of the input d-ary heap, constructed on the fly. Simple adaptations of Lemmas 3.1 and 3.2 show that the total running time, excluding the cost of the heapifications, is O(k). As only O(k) heapifications are performed, each in O(d) time, the cost of all heapifications is O(dk), giving the total running time of the algorithm.

It is also possible to binarize the input heap on the fly. We first ternarize the heap as above. We then convert the resulting ternary tree into a binary tree using the standard first child, next sibling representation. This converts the ternary heap into a binary heap, if the three children of each item are sorted. During the ternarization process, we can easily make sure that the three children of each item appear in sorted order, swapping children if necessary, so we can apply this final binarization step.

### 3.3 Selection from general heap-ordered trees

Algorithm Soft-Select-Ternarized works, of course, on arbitrary heap-ordered trees in which different nodes have different degrees. Algorithm Soft-Select, on the other hand, is not easily adapted to work on general heap-ordered trees, as it is unclear how to set the error parameter ε to obtain an optimal running time. To bound the running time of Soft-Select-Ternarized on an arbitrary heap-ordered tree, we introduce the following definition.

###### Definition 3.5 (D(T,k))

Let T be a (possibly infinite) rooted tree and let k ≥ 1. Let D(T,k) be the maximum sum of degrees over all subtrees of T of size k rooted at the root of T. (The degrees summed are degrees in T, not in the subtree.)

For example, if T is an infinite d-ary tree, then D(T,k) = dk, as the sum of degrees in each subtree of T containing k vertices is dk. For a more complicated example, let T be the infinite tree in which each node at level i has degree i+2, i.e., the root has two children, each of which has three children, etc. Then, D(T,k) = ∑_{i=0}^{k−1} (i+2) = Θ(k²), where the subtrees achieving this maximum are paths starting at the root. A simple adaptation of Theorem 3.4 gives:

###### Theorem 3.6

Let T be a heap-ordered tree with root r. Algorithm Soft-Select-Ternarized selects the k-th smallest item in T, and the set of k smallest items in T, in O(D(T,ck)) time, for some constant c > 1.

Proof:  We use the on-the-fly binarization and a soft heap with a sufficiently small error parameter ε. By (an adaptation of) Lemma 3.2, the number of corrupt items is less than c₁k, for some constant c₁ = c₁(ε). The number of extracted items is at most k. Thus, the algorithm needs to heapify the children of fewer than ck items, for some constant c = c(ε) > 1, and these items form a subtree T' of the original tree T. The sum of the degrees of these items is at most D(T,ck), thus the total time spent on the heapifications, which dominates the running time of the algorithm, is O(D(T,ck)). We note that c can be replaced by 1+δ, for any δ > 0, by choosing ε small enough.

###### Theorem 3.7

Let T be a heap-ordered tree and let k ≥ 1. Any comparison-based algorithm for selecting the k-th smallest item in T must perform at least D(T,k−1) − k + 1 comparisons on some inputs.

Proof:  Let T' be the subtree of T of size k−1 that achieves the value D(T,k−1), i.e., the sum of the degrees of the nodes of T' is D(T,k−1). Suppose the items of T' are the k−1 smallest items in T. The nodes of T' have at least D(T,k−1) − (k−2) children that are not in T'. The k-th smallest item is the minimum item among these children, and no information on the order of these children is implied by the heap order of the tree. Thus, finding the k-th smallest item in this case requires at least D(T,k−1) − k + 1 comparisons.

## 4 Selection from row-sorted matrices

In this section we present algorithms for selecting the k smallest items from a row-sorted matrix, or equivalently from a collection of sorted lists. Our results simplify and extend results of Frederickson and Johnson. The algorithms presented in this section use our Soft-Select algorithm for selection from a binary min-heap, presented in Section 3.1. (Frederickson's algorithm could also be used, but the resulting algorithms would become much more complicated, in particular more complicated than the algorithms of Frederickson and Johnson.)

In Section 4.1 we give an O(m+k) algorithm, where m is the number of rows, which is optimal if k = O(m). In Section 4.2 we give an O(m log(k/m)) algorithm, which is optimal for k ≥ 2m. These results match results given by Frederickson and Johnson. In Sections 4.3 and 4.4 we give two new algorithms that improve in some cases over the previous algorithms.

### 4.1 An O(m+k) algorithm

A sorted list may be viewed as a heap-sorted path, i.e., a 1-ary heap. We can convert a collection of m sorted lists into a (degenerate) binary heap by building a binary tree whose leaves are the first items of the lists. The values of the m−1 internal nodes of this tree are set to −∞. Each item in a list will have one real child, its successor in the list, and a dummy child with value +∞. To find the k smallest items in the lists, we simply find the k+m−1 smallest items in this binary heap and discard the −∞'s. This can be done in O(m+k) time using algorithm Soft-Select of Section 3.1. More directly, we can use the following straightforward modification of algorithm Soft-Select. Insert the m first items of the lists into a soft heap. Perform k iterations in which an item with minimum (corrupt) key is extracted. Insert into the soft heap the successor of the item extracted, as well as the successors of all the items that became corrupt following the extract-min operation.

Alternatively, we can convert the m sorted lists into a heap-ordered tree by adding a root with value −∞ whose children are the m first items of the lists. All other nodes in the tree have degree 1. It is easy to see that D(T,k) ≤ m + k − 1. By Theorem 3.6 we again get an O(m+k) algorithm. We have thus presented three different proofs of the following theorem.
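The direct variant is easy to sketch with a standard binary heap in place of the soft heap (the name `select_from_sorted_lists` is ours). With `heapq` the auxiliary heap never holds more than m entries, giving O(m + k log m) time; substituting a soft heap, as in Soft-Select, removes the log factor:

```python
import heapq

def select_from_sorted_lists(lists, k):
    # Treat the i-th sorted list as a heap-ordered path: the only real child
    # of an item is its successor in its own list. Returns the k smallest
    # items over all lists, in sorted order.
    aux = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(aux)                   # O(m): one entry per non-empty list
    out = []
    for _ in range(k):
        key, i, j = heapq.heappop(aux)
        out.append(key)
        if j + 1 < len(lists[i]):        # child = successor in the same list
            heapq.heappush(aux, (lists[i][j + 1], i, j + 1))
    return out
```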

###### Theorem 4.1

Let A be a row-sorted matrix containing m rows. Then, the k-th smallest item in A, and the set of k smallest items in A, can be found in O(m+k) time.

We refer to the algorithm of Theorem 4.1 simply as the O(m+k) algorithm. Its running time is asymptotically optimal if k = O(m), as m is clearly a lower bound; each selection algorithm must examine at least one item in each row of the input matrix.

### 4.2 An O(m log(k/m)) algorithm, for k ≥ 2m

Figure 5: Partitioning the items in each row into blocks of size b. Block representatives are shown as small filled circles. The shaded region contains the k smallest items. The darkly shaded region depicts blocks all of whose items are among the k smallest.

We begin with a verbal description of the algorithm. Let A be the input matrix and let b = ⌊k/(2m)⌋. Partition each row of the matrix A into blocks of size b. The last item in each block is the representative of the block. Consider the (yet unknown) distribution of the k smallest items among the m rows of the matrix. Let k_i be the number of items in the i-th row that are among the k smallest items in the whole matrix. These items are clearly the first k_i items of the i-th row. They are partitioned into a number of full blocks, followed possibly by one partially filled block. (For an example, see Figure 5.) The number of items in partially filled blocks is at most m(b−1). Thus, the number of full blocks is at least

(k − m(b−1))/b ≥ k/b − m ≥ 2m − m = m,

as b ≤ k/(2m).

Apply the O(m+k) algorithm of Theorem 4.1 to select the smallest m block representatives. This clearly takes only O(m) time. (The algorithm is applied on the implicitly represented matrix of block representatives, whose i-th row consists of every b-th item of the i-th row of A.) All items in the m blocks whose representatives were selected are among the k smallest items of the matrix. The number of such items is mb ≥ k/4, as b = ⌊k/(2m)⌋ ≥ k/(4m) when k ≥ 2m. These items can be removed from the matrix. All that remains is to select the smallest k − mb remaining items using a recursive call to the algorithm. In each recursive call (or iteration), the total work is O(m). The number of items to be selected drops by a factor of at least 4/3. Thus, after at most O(log(k/m)) iterations, k drops below 2m, and then the O(m+k) algorithm is called to finish the job in O(m) time.
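The whole procedure can be sketched as follows, with Python's `heapq` (an ordinary binary heap) standing in both for the soft-heap-based O(m+k) selection and for the implicit representative matrix; the constants used (block size ⌊k/(2m)⌋, m representatives per round) are one consistent choice rather than necessarily those of the original pseudo-code, and the asymptotic bound of the sketch is correspondingly weaker:

```python
import heapq

def _smallest_reps(rows, taken, b, t):
    """Per-row counts of the t smallest block representatives, where the
    representatives of row i are its remaining items at offsets b, 2b, ..."""
    heap = []
    for i, row in enumerate(rows):
        j = taken[i] + b - 1              # first remaining representative
        if j < len(row):
            heap.append((row[j], i, j))
    heapq.heapify(heap)
    counts = [0] * len(rows)
    for _ in range(t):
        val, i, j = heapq.heappop(heap)
        counts[i] += 1
        if j + b < len(rows[i]):
            heapq.heappush(heap, (rows[i][j + b], i, j + b))
    return counts

def block_select(rows, k):
    """Select the k smallest items of a row-sorted matrix; returns
    taken[i] = number of selected items coming from row i."""
    m = len(rows)
    taken = [0] * m
    while k >= 2 * m:
        b = k // (2 * m)                  # block size for this round
        counts = _smallest_reps(rows, taken, b, m)
        for i in range(m):                # each selected representative's whole
            taken[i] += counts[i] * b     # block is among the k smallest
        k -= m * b
    # Finish with the O(m + k) heap-based selection on the remainders.
    heap = [(rows[i][taken[i]], i) for i in range(m) if taken[i] < len(rows[i])]
    heapq.heapify(heap)
    for _ in range(k):
        val, i = heapq.heappop(heap)
        taken[i] += 1
        if taken[i] < len(rows[i]):
            heapq.heappush(heap, (rows[i][taken[i]], i))
    return taken
```

For example, `block_select([[1,3,5],[2,4,6]], 3)` reports how many of the three smallest items come from each row.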

Pseudo-code of the algorithm described above is given in Figure 7. The algorithm returns an array k_1,…,k_m, where k_i is the number of items in the i-th row that are among the k smallest items of the matrix. The algorithm uses a function returning the number of rows of a given matrix, a function jump(A, b) that returns an (implicit) representation of a matrix B such that B[i,j] = A[i, jb], for every row i, and a function shift(A, d) that returns an (implicit) representation of a matrix C such that C[i,j] = A[i, j + d[i]], for every row i.

In Figure 7 we eliminate the use of jump and shift and make everything explicit. The input matrix is now represented by a triplet (A, b, d), where A is a matrix, b is an integer, and d is an array of non-negative integral displacements. The algorithm selects the k smallest items in the matrix B such that B[i,j] = A[i, d[i] + jb], for every row i. To select the k smallest items in A itself, we simply call the algorithm on (A, 1, 0), where 0 represents an array of m zeros. The implementation in Figure 7 is recursive. It is easy to convert it into an equivalent iterative implementation.

###### Theorem 4.2

Let A be a row-sorted matrix containing m rows and let k ≥ 2m. The algorithm described above selects the k smallest items in A in O(m log(k/m)) time.

Frederickson and Johnson showed that the running time of O(m log(k/m)) is optimal when k ≥ 2m. A simple proof of this claim can also be found in Section 4.5.

### 4.3 An O(m + ∑_{i=1}^m log n_i) algorithm

Assume now that the i-th row of A contains only n_i items. We assume that n_i ≥ 1, as otherwise we can simply remove the i-th row. We can run the algorithms of the previous sections by adding dummy +∞ items at the end of each row, but this may be wasteful. We now show that a simple modification of the algorithm of Section 4.2 can solve the selection problem in O(m + ∑_{i=1}^m log n_i) time. We focus first on the number of comparisons performed by the new algorithm.

At the beginning of each iteration, the algorithm of Section 4.2 sets the block size to b = ⌊k/(2m)⌋. If n_i < b, then the last item in the first block of the i-th row is a dummy +∞. Assuming that at least m representatives with finite value exist, which is the case as argued in Section 4.2, no representative of the i-th row will be selected in the current iteration. There is therefore no point in considering the i-th row in the current iteration. Let m′ be the number of long rows, i.e., rows for which n_i ≥ b. We want to reduce the running time of the iteration to O(m′) and still reduce k by some constant factor.

The total number of items in the short rows is less than mb ≤ k/2. The long rows thus contain more than k/2 of the k smallest items of the matrix. We can thus run an iteration of the algorithm on the long rows with k replaced by k/2. In other words, we adjust the block size to ⌊k/(4m′)⌋ and use the O(m+k) algorithm to select the m′ smallest block representatives of the long rows. This identifies at least k/8 items as belonging to the k smallest items in A. Thus, each iteration takes O(m′) time and reduces k by a factor of at least 8/7.

In how many iterations does each row of the matrix participate? Let K_j be the number of items still to be selected at the beginning of iteration j, and let b_j be the threshold for long rows used in iteration j. As K_j drops exponentially, so does b_j. Thus, row i participates only in iterations with b_j ≤ n_i, i.e., in at most O(log n_i) of the last iterations of the algorithm. The total number of comparisons performed is thus at most O(m + ∑_{i=1}^m log n_i), as claimed.

To show that the algorithm can also be implemented to run in O(m + ∑_{i=1}^m log n_i) time, we need to show that we can quickly identify the rows that are long enough to participate in each iteration. To do that, we sort the rows by length using bucket sort, with a bucket for each value of ⌊log₂ n_i⌋. This takes only O(m) time. When a row loses some of its items, it is easy to move it to the appropriate bucket in O(1) time. In each iteration we may need to examine rows in one bucket that turn out not to be long enough, but this does not affect the total running time of the algorithm.
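A sketch of the bucketing step, with hypothetical helper names: buckets are indexed by ⌊log₂ n_i⌋, so an iteration scans only the buckets that can contain rows of length at least b, and only the boundary bucket may contain rows that fail the test:

```python
from collections import defaultdict

def bucket_rows_by_length(lengths):
    """Bucket row indices by floor(log2(n_i)); an O(m) "bucket sort"."""
    buckets = defaultdict(list)
    for i, n in enumerate(lengths):
        buckets[n.bit_length() - 1].append(i)   # floor(log2 n) for n >= 1
    return buckets

def long_rows(buckets, lengths, b):
    """All rows with n_i >= b.  Bucket e holds lengths in [2^e, 2^(e+1));
    only buckets with 2^(e+1) > b can contain long rows, and only the
    boundary bucket may contain rows that fail the n_i >= b test."""
    out = []
    for e in range(b.bit_length() - 1, max(buckets, default=-1) + 1):
        out.extend(i for i in buckets.get(e, []) if lengths[i] >= b)
    return out
```

The helper names (`bucket_rows_by_length`, `long_rows`) are ours, not the paper's.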

###### Theorem 4.3

Let A be a row-sorted matrix containing m rows, and let n_i ≥ 1 be the number of items in the i-th row of the matrix, for i = 1,…,m. Let n = ∑_{i=1}^m n_i and let 1 ≤ k ≤ n. The algorithm described above selects the k smallest items in A in O(m + ∑_{i=1}^m log n_i) time.

In Section 4.5 below we show that the running time of this algorithm is optimal for some values of k and n_1,…,n_m, e.g., if k = ⌊n/2⌋, i.e., for median selection. The running time of this algorithm is sometimes better than the O(m log(k/m)) of the algorithm of Section 4.2. We next describe an algorithm that is always at least as fast as the three algorithms already presented, and sometimes faster.

### 4.4 An O(m + ∑_{i=1}^m log(k_i+1)) algorithm

As before, let k_i be the (yet unknown) number of items in the i-th row that belong to the smallest k items of the matrix. In this section we describe an algorithm for finding these k_i's that runs in O(m + ∑_{i=1}^m log(k_i+1)) time.

We partition each row this time into blocks of exponentially increasing sizes 1, 2, 4, …, so that the j-th block of a row ends at position 2^j − 1. The representative of a block is again the last item in the block. Note that the first k_i items in row i reside in ⌊log₂(k_i+1)⌋ complete blocks, plus one incomplete block, if log₂(k_i+1) is not an integer. Thus ⌊log₂(k_i+1)⌋ is exactly the number of block representatives of row i that belong to the k smallest items of the matrix.
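Assuming blocks of sizes 1, 2, 4, … with the last item of each block as representative (so representatives sit at 1-indexed positions 2^j − 1), the count of representatives among the first k_i items of a row can be checked against the closed form ⌊log₂(k_i+1)⌋:

```python
def reps_among_first(k_i):
    """Number of block representatives among the first k_i items of a row,
    when the blocks have sizes 1, 2, 4, ... (representatives at 1-indexed
    positions 2^j - 1).  Counted directly, for comparison with the
    closed form floor(log2(k_i + 1))."""
    count, pos = 0, 1            # pos = 2^j - 1 for j = 1, 2, ...
    while pos <= k_i:
        count += 1
        pos = 2 * pos + 1        # next representative: 2^(j+1) - 1
    return count
```

In Python, `(k + 1).bit_length() - 1` computes ⌊log₂(k+1)⌋ exactly, avoiding floating-point logarithms.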

Suppose that S is an upper bound on the true value of s = ∑_{i=1}^m ⌊log₂(k_i+1)⌋, the total number of block representatives that belong to the k smallest items. We can run the O(m+k) algorithm of Theorem 4.1 to select the smallest S block representatives in O(m+S) time. If r_i representatives were selected from row i, we let u_i = 2^{r_i+1} − 1; as r_i ≥ ⌊log₂(k_i+1)⌋, we have k_i ≤ u_i. We now run the algorithm of Section 4.3, with row i restricted to its first min(n_i, u_i) items, which runs in O(m + ∑_{i=1}^m log(u_i+1)) = O(m+S) time. Thus, if S = O(m + s), the total running time is O(m + ∑_{i=1}^m log(k_i+1)), as promised.

How do we find a tight upper bound on s? We simply try S = m, 2m, 4m, 8m, …, until we obtain a value of S that is high enough. If S < s, i.e., S is not large enough, we can discover it in one of two ways. Either ∑_{i=1}^m u_i < k, in which case S is clearly too small. Otherwise, the algorithm returns an array of values k_1,…,k_m. We can check whether these values are the correct ones in O(m) time. First compute x = max_{k_i>0} A[i, k_i], the largest item claimed to be among the k smallest. Next check that x ≤ A[i, k_i+1], for every i with k_i < n_i. As S is doubled in each iteration, the cost of the last iteration dominates the total running time, which is thus O(m + s) = O(m + ∑_{i=1}^m log(k_i+1)).
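The O(m) verification step can be sketched as follows; the function name and calling convention are ours, with `counts[i]` playing the role of the candidate k_i:

```python
def verify_counts(rows, counts, k):
    """O(m)-time check that taking the first counts[i] items of each row
    yields exactly (some set of) k smallest items of the whole matrix."""
    if any(c < 0 or c > len(rows[i]) for i, c in enumerate(counts)):
        return False
    if sum(counts) != k:
        return False
    # Largest item claimed to be among the k smallest.
    x = max((rows[i][c - 1] for i, c in enumerate(counts) if c > 0),
            default=float("-inf"))
    # Every row's first unclaimed item must be at least x.
    return all(c == len(rows[i]) or rows[i][c] >= x
               for i, c in enumerate(counts))
```

The check relies only on the rows being sorted: a prefix selection of total size k is correct exactly when no unclaimed item is smaller than a claimed one.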

###### Theorem 4.4

Let A be a row-sorted matrix containing m rows and let 1 ≤ k ≤ n. The algorithm described above selects the k smallest items in A in O(m + ∑_{i=1}^m log(k_i+1)) time, where k_i is the number of items selected from row i.

### 4.5 Lower bounds for selection from row-sorted matrices

We begin with a simple proof that the O(m log(k/m)) algorithm of Section 4.2 is optimal for k ≥ 2m.

###### Theorem 4.5

Any algorithm for selecting the k smallest items from a matrix with m ≥ 2 sorted rows must perform at least (m−1) log₂((m+k)/m) comparisons on some inputs.

Proof:  We use the information-theoretic lower bound. We need to lower bound the number of distinct answers, which is the number of m-tuples (k_1,…,k_m), where k_i ≥ 0, for i = 1,…,m, and k_1+⋯+k_m = k. It is easily seen that this number is C(m+k−1, m−1), as this is the number of ways to arrange k identical balls and m−1 identical dividers in a row. We thus get a lower bound of

log₂ C(m+k−1, m−1) ≥ log₂ ((m+k−1)/(m−1))^(m−1) = (m−1) log₂ ((m+k−1)/(m−1)) ≥ (m−1) log₂ ((m+k)/m),

where we used the well-known relation C(n, r) ≥ (n/r)^r, together with the inequality (m+k−1)/(m−1) ≥ (m+k)/m.
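The counting argument is easy to check numerically; a small sketch (the function name is ours):

```python
from math import comb, log2

def selection_lower_bound(m, k):
    """log2 of the number of possible answers, C(m+k-1, m-1): the
    information-theoretic lower bound (in comparisons) for selecting
    the k smallest items from m sorted rows."""
    return log2(comb(m + k - 1, m - 1))
```

Both inequalities in the displayed chain can then be verified for small m and k.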

We next show that the algorithm of Section 4.3 is optimal, at least in some cases, e.g., when k = ⌊n/2⌋, which corresponds to median selection.

###### Theorem 4.6

Any algorithm for selecting the ⌊n/2⌋ smallest items from a row-sorted matrix with m rows of lengths n_1,…,n_m, where n = n_1+⋯+n_m, must perform at least ∑_{i=1}^m log₂(n_i+1) − log₂(n+1) comparisons on some inputs.

Proof:  The number of possible solutions to the selection problem, over all values of k, is ∏_{i=1}^m (n_i+1). (Each solution corresponds to a choice of 0 ≤ k_i ≤ n_i, for i = 1,…,m.) We prove below that the number of solutions is maximized for k = ⌊n/2⌋ (and k = ⌈n/2⌉). Since k takes only n+1 possible values, the number of possible solutions for this value of k is thus at least ∏_{i=1}^m (n_i+1)/(n+1). Taking logarithms, we get the promised lower bound.

We next prove that the number of solutions is maximized when k = ⌊n/2⌋. Let X_i be a uniform random variable on {0, 1, …, n_i}, for i = 1,…,m, and let S_m = X_1 + ⋯ + X_m. The number of solutions for a given value of k is proportional to the probability that S_m attains the value k. We prove by induction on m that the distribution of S_m is maximized at ⌊n/2⌋, is symmetric around n/2, and is increasing up to ⌊n/2⌋ and decreasing after ⌈n/2⌉. The base case is obvious, as X_1 is a uniform distribution. The induction step follows from an easy calculation. Indeed, S_m = S_{m−1} + X_m, where X_m is uniform and S_{m−1} has the required properties. The distribution of S_m is the convolution of the distributions of S_{m−1} and X_m, which corresponds to taking the average of n_m+1 consecutive values of the distribution of S_{m−1}. It follows easily that S_m also has the required properties.
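The inductive claim can be checked numerically by carrying out the convolutions on counts rather than probabilities; a small sketch (the function name is ours):

```python
def solution_counts(lengths):
    """counts[k] = number of tuples (k_1, ..., k_m) with 0 <= k_i <= n_i
    and k_1 + ... + k_m = k, built by convolving uniform distributions,
    mirroring the inductive argument above (scaled to counts)."""
    dist = [1]
    for n in lengths:
        new = [0] * (len(dist) + n)
        for a, v in enumerate(dist):
            for b in range(n + 1):   # convolve with uniform on {0, ..., n}
                new[a + b] += v
        dist = new
    return dist
```

Symmetry around n/2 and unimodality with the maximum at ⌊n/2⌋ can then be asserted directly on the resulting array.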

We next compare the lower bound obtained, ∑_{i=1}^m log(n_i+1) − log(n+1), with the upper bound O(m + ∑_{i=1}^m log n_i). The subtracted term in the lower bound is dominated by the first term, i.e., log(n+1) ≤ ∑_{i=1}^m log(n_i+1), where equality holds only if n_i = 0, for every i but one. When the n_i's are large, the subtracted term becomes negligible. Also, as every algorithm must examine each row, m is always a lower bound as well. Thus, the lower and upper bounds are always within a constant multiplicative factor of each other.

The optimality of the algorithm of Section 4.3 also implies the optimality of our new "output-sensitive" algorithm of Section 4.4. As k_i ≤ n_i for every i, an algorithm that performs fewer than c(m + ∑_{i=1}^m log(k_i+1)) comparisons on all inputs, for some small enough constant c > 0, would contradict the lower bounds for the algorithm of Section 4.3.

## 5 Selection from X+Y

We are given two unsorted sets X and Y and we would like to find the k-th smallest item, and the set of k smallest items, in the set X+Y = { x+y : x ∈ X, y ∈ Y }. We assume that |X| = m and |Y| = n, where m ≥ n.

### 5.1 An O(m+n+k) algorithm

Heapify X and heapify Y, which takes O(m+n) time. Let x_1, x_2, …, x_m be the heapified order of X, i.e., x_i ≤ x_{2i} and x_i ≤ x_{2i+1}, whenever the respective items exist. Similarly, let y_1, y_2, …, y_n be the heapified order of Y. Construct a heap of maximum degree 4 representing X+Y as follows. The root is x_1+y_1. Item x_i+y_1, for 1 ≤ i ≤ m, has the four children x_{2i}+y_1, x_{2i+1}+y_1, x_i+y_2, x_i+y_3. Item x_i+y_j, for 1 ≤ i ≤ m and 2 ≤ j ≤ n, has the two children x_i+y_{2j} and x_i+y_{2j+1}, again when the respective items exist. (Basically, this is a heapified version of X, where each x_i is the root of a heapified version of x_i+Y.) We can now apply algorithm Soft-Select on this heap.
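The construction can be sketched as follows; `heapq` again stands in for the soft heap (costing an extra logarithmic factor), and sorting X and Y is one convenient way to obtain a heapified order:

```python
import heapq

def select_k_from_xy(X, Y, k):
    """Return the k smallest sums x + y for non-empty X, Y.  heapq (a
    plain binary heap) stands in for the soft heap, so this sketch runs
    in O((m + n + k) log k) time rather than O(m + n + k)."""
    x, y = sorted(X), sorted(Y)   # sorted order is, in particular, heap order
    m, n = len(x), len(y)

    def children(i, j):           # 1-based indices into the implicit heap
        if j == 1:                # x_i + y_1 also carries the X-heap edges
            for c in (2 * i, 2 * i + 1):
                if c <= m:
                    yield c, 1
        for c in (2 * j, 2 * j + 1):   # Y-heap edges within "row" i
            if c <= n:
                yield i, c

    heap = [(x[0] + y[0], 1, 1)]  # root: x_1 + y_1
    out = []
    for _ in range(min(k, m * n)):
        s, i, j = heapq.heappop(heap)
        out.append(s)
        for ci, cj in children(i, j):
            heapq.heappush(heap, (x[ci - 1] + y[cj - 1], ci, cj))
    return out
```

Every pair (i, j) is reached exactly once, each node has at most four children, and the heap order holds because x and y are individually ordered.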

###### Theorem 5.1

Let X and Y be unordered sets of m and n items, respectively. Then, the k-th smallest item, and the set of k smallest items, in X+Y can be found in O(m+n+k) time.

### 5.2 An O(m log(k/m)) algorithm, for k ≥ 6m, m ≥ n

If Y is sorted, then X+Y is a row-sorted matrix whose i-th row is x_i+Y, and we can use the algorithm of Theorem 4.2. We can sort Y in O(n log n) time and get an O(n log n + m log(k/m)) algorithm. The running time of this algorithm is O(m log(k/m)) when k ≥ m^{1+ε}, for any fixed ε > 0. But, for certain values of m, n and k, e.g., m = n and k = Θ(m), the cost of sorting is dominant. We show below that the sorting can always be avoided.

We first digress and describe an alternative algorithm for selection from row-sorted matrices. The algorithm is somewhat more complicated than the algorithm given in Section 4.2. The advantage of the new algorithm is that it makes much weaker assumptions about the order of the items in each row. A similar approach was used by Frederickson and Johnson, but we believe that our approach is simpler. In particular, we rely on a simple partitioning lemma (Lemma