 # Sorted Top-k in Rounds

We consider the sorted top-k problem whose goal is to recover the top-k items with the correct order out of n items using pairwise comparisons. In many applications, multiple rounds of interaction can be costly. We restrict our attention to algorithms with a constant number of rounds r and try to minimize the sample complexity, i.e. the number of comparisons. When the comparisons are noiseless, we characterize how the optimal sample complexity depends on the number of rounds (up to a polylogarithmic factor for general r and up to a constant factor for r=1 or 2). In particular, the sample complexity is Θ(n^2) for r=1, Θ(n√k + n^(4/3)) for r=2 and Θ̃(n^(2/r) k^((r-1)/r) + n) for r ≥ 3. We extend our results on sorted top-k to the noisy case where each comparison is correct with probability 2/3. When r=1 or 2, we show that the sample complexity gets an extra Θ(log(k)) factor when we transition from the noiseless case to the noisy case. We also prove new results for top-k and sorting in the noisy case. We believe our techniques can be generally useful for understanding the trade-off between round complexities and sample complexities of rank aggregation problems.


## 1 Introduction

Rank aggregation is a fundamental problem which finds numerous applications in recommendation systems, web search, social choice, peer grading and crowdsourcing. The most studied problem in rank aggregation is sorting, which aims to find the total ordering of all items. People also consider the top-k problem, where it is only necessary to recover the set of top-k items. However, for some applications, the rank aggregation task required is neither sorting nor top-k. For example, when a recommendation system shows the user a list of items, it might want to display these items in the order of recommendation. As another example, in a tournament, people usually care about the exact rankings of top players but not the others.

These examples motivate us to study the sorted top-k problem which lies between sorting and top-k. In this problem, we have n items with an underlying order and the goal is to recover the top-k items with the correct order using pairwise comparisons.

In many applications, multiple rounds of interaction are costly. For example, if we collect comparison data via crowdsourcing, the comparisons can be done in parallel by different crowd workers and the total amount of time spent is mainly decided by the number of rounds. Therefore we consider the sorted top-k problem in the parallel comparison model introduced by [Valiant(1975)]. In this model, an algorithm performs a set of comparisons in each round and the actual set can depend on the comparison results of previous rounds. The goal is to solve the task in a bounded number of rounds while minimizing the sample complexity, i.e. the total number of comparisons.

Without the constraint on the number of rounds, sorted top-k can be easily solved by combining sorting and top-k algorithms: we can first use a top-k algorithm to find the set of top-k items and then use a sorting algorithm to sort these k items. If we use sorting and top-k algorithms with optimal sample complexities, one can easily show that the combination gives a sorted top-k algorithm with optimal sample complexity (up to a constant factor).

However, if we only have a bounded number of rounds, such a combined algorithm might not give the optimal sample complexity. The adaptiveness of the combined procedure splits the rounds into rounds used by the top-k algorithm and rounds used by the sorting algorithm. As we will see later, this is not the optimal way to solve sorted top-k in a bounded number of rounds.

In this paper, we show optimal sample complexity bounds (up to polylogarithmic factors) for sorted top-k in r rounds for any constant r, as shown in Table 1. Our bounds are tight up to constant factors when r = 1 or r = 2.

We further extend our results to sorted top-k in the noisy case, where each comparison is correct with probability 2/3. This is a very simple and basic noise model. As shown in Table 2, we get tight bounds (up to a constant factor) when r = 1 or 2. Compared with the sample complexity in the noiseless case, the sample complexity in the noisy case just has an extra Θ(log(k)) factor when r = 1 or 2.

Our techniques also give new results for top-k and sorting in the noisy case. Our sorted top-k algorithms are based on our top-k algorithms. For top-k in the noisy case, we show that the tight sample complexity bounds are Θ(n^2) for r = 1 and Θ(n^(4/3)) for r = 2. On the other hand, sorting is the special case of sorted top-k with k = n. Our sorted top-k results imply tight bounds for sorting in the noisy case: Θ(n^2 log(n)) for r = 1 and Θ(n^(3/2) log(n)) for r = 2.

### 1.1 Related Work

Sorted top-k was first discussed in [Chambers(1971)], where it is referred to as "partial sorting".

The noisy comparison model was introduced by [Feige et al.(1994)Feige, Raghavan, Peleg, and Upfal]. Recently, several works have studied top-k with noisy comparisons in a bounded number of rounds [Braverman et al.(2016)Braverman, Mao, and Weinberg, Agarwal et al.(2017)Agarwal, Agarwal, Assadi, and Khanna, Cohen-Addad et al.(2018)Cohen-Addad, Mallmann-Trenn, and Mathieu]. [Braverman et al.(2016)Braverman, Mao, and Weinberg] determines the sample complexity of top-k in the noisy case for certain ranges of the number of rounds, and [Cohen-Addad et al.(2018)Cohen-Addad, Mallmann-Trenn, and Mathieu] gives tight sample complexity bounds for general r and k. The sample complexity of top-k in the noisy case for r = 1 and 2 is not addressed in previous work.

## 2 Model and Preliminaries

We consider the sorted top-k problem together with two related problems: sorting and top-k. In these problems, there is a set of n items with an underlying order and the goals are different:

• Sorted top-k: output the sorted list of the k items with the highest ranks.

• Sorting: output the ranks of all items.

• Top-k: output the set of the k items with the highest ranks.

Algorithms are allowed to make pairwise comparisons, and we want the algorithm to minimize the sample complexity, i.e. the total number of comparisons. We consider two cases with respect to comparisons: the noiseless case and the noisy case. In the noiseless case, comparison results are always consistent with the underlying order. In the noisy case, each pairwise comparison is correct (consistent with the underlying order) with some constant probability, independently. Without loss of generality, we assume each comparison is correct with probability 2/3, independently.
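To make the model concrete, here is a minimal Python sketch of such a noisy comparison oracle; the function and parameter names are ours, not the paper's.

```python
import random

def make_noisy_comparator(rank, p=2/3, rng=None):
    """Return a comparison oracle over items 0..n-1.

    rank[i] is item i's position in the underlying order (smaller = better).
    Each call independently reports the true outcome with probability p.
    Illustrative sketch of the model; setting p = 1 recovers the noiseless case.
    """
    rng = rng or random.Random(0)

    def compare(i, j):
        truth = rank[i] < rank[j]  # does i beat j in the true order?
        return truth if rng.random() < p else not truth

    return compare
```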

We consider algorithms with a bounded number of rounds. In each round, an algorithm needs to perform all of its comparisons simultaneously. We use r to denote the number of rounds and we only consider the case when r is a fixed constant.

We allow algorithms to use randomness. In the noiseless case, the algorithm needs to be always correct and the sample complexity is the expected total number of comparisons. In the noisy case, because of the noise, no algorithm can always be correct. The algorithm needs to be correct with probability at least a large constant (say 0.9) and the sample complexity is the worst-case total number of comparisons. Notice that the requirement in the noiseless case is stronger, as an always-correct algorithm with a bounded expected number of comparisons can easily be turned into one which succeeds with constant probability using a worst-case bounded number of comparisons, by halting the algorithm when it makes too many comparisons (by Markov's inequality).

## 3 Main Results and Proof Overviews

In this section, we show our main results and give overviews of our proof techniques.

### 3.1 Sorted Top-k in the Noiseless Case

In this sub-section, we show our results for sorted top-k in the noiseless case. All the detailed discussions and proofs can be found in Section A.

When we only have 1 round (i.e. r = 1), it is not hard to show that comparing all pairs of items using Θ(n^2) comparisons gives the optimal sample complexity (up to a constant factor). We formally discuss this in the beginning of Section A. For r ≥ 2, we have the following two main theorems for upper and lower bounds.

**Theorem 3.1.** For r ≥ 3, there exists an r-round algorithm that solves sorted top-k with Õ(n^(2/r) k^((r-1)/r) + n) comparisons in expectation. There exists a 2-round algorithm that solves sorted top-k with O(n√k + n^(4/3)) comparisons in expectation.

The main idea of the algorithm in Theorem 3.1 is to use "pivot items". These pivot items are compared to all items. From these comparisons, we learn not only their ranks but also which items rank between two consecutive pivot items. After that, items are partitioned into chunks and we just need to solve subproblems inside chunks. See Figure 1 for a graphical view of pivot items.

Suppose we plan to use m comparisons. The most naive way of using pivot items is to pick m/n pivot items at random in the first round and compare them to all items in the same round. After this round, we will be left with chunks of items partitioned by the pivot items. Since we now know the ranks of the pivot items, we know which chunks contain top-k items and we only need to care about these chunks. We use the remaining rounds to run the bounded-round sorting algorithm of [Alon et al.(1986)Alon, Azar, and Vishkin] in each such chunk in parallel. This approach, with a proper setting of m, matches the optimal sample complexity bound (up to a constant factor) when k is at least the expected chunk size. It is formally described in Algorithm 2 in Section A. See also Figure 2 for a graphical view of the algorithm.
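The first-round pivot step can be sketched as follows — a simplified, noiseless illustration, not the paper's Algorithm 2. We assume `compare(i, j)` returns True when item i beats item j; all comparisons issued here could be performed in a single round.

```python
import random

def partition_by_pivots(items, num_pivots, compare, rng=None):
    """One round of the naive pivot scheme (a sketch).

    Picks random pivots, compares every item to every pivot, orders the
    pivots by their win counts (noiseless comparisons fully order them),
    and buckets the remaining items into the chunks the pivots delimit.
    """
    rng = rng or random.Random(0)
    pivots = rng.sample(items, num_pivots)
    rest = [x for x in items if x not in pivots]
    # A pivot's win count against all items determines its rank.
    pivots.sort(key=lambda p: sum(compare(p, x) for x in items), reverse=True)
    chunks = [[] for _ in range(num_pivots + 1)]
    for x in rest:
        # The number of pivots beating x tells us which chunk x falls in.
        beats = sum(compare(p, x) for p in pivots)
        chunks[beats].append(x)
    return pivots, chunks
```

The chunks containing top-k items would then be sorted in parallel by a bounded-round sorting algorithm in the remaining rounds.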

However, when r ≥ 3 and k is small enough that the first chunk is likely to have size much larger than k (see Figure 3), the above approach becomes sub-optimal. At a high level, the reason is that the random pivot items chosen in the first round are not good enough to partition the top-k items into small chunks. Therefore, instead of running the sorting algorithm on the first chunk in the remaining rounds, we spend one more round (round 2) to compare items to more "accurate" pivot items, partition them into smaller chunks and sort each chunk in the remaining rounds. These new pivot items are better than the random pivot items for two reasons: (i) We can spend some comparisons in the first round to choose these pivot items, so they have better structural guarantees than the random pivot items. In particular, we extend the algorithmic technique of [Braverman et al.(2016)Braverman, Mao, and Weinberg] to pick pivot items which are roughly evenly spaced. (ii) Since these pivot items are compared to other items in the second round, we can use the comparison results of the random pivot items. Knowing that all top-k items are in the first chunk partitioned by the random pivot items, we just need to compare the new pivot items to items in that chunk. This is important for getting good sample complexity. The whole process of this paragraph is formally described in Algorithm 3 in Section A. See also Figure 3 for a graphical view of the algorithm.

In Section A, we combine the above two approaches to prove Theorem 3.1. Both approaches use pivot items followed by bounded-round sorting. Although the sample complexity keeps decreasing as we increase the number of rounds, we have at most 2 rounds that are different from sorting no matter how large the total number of rounds is. One may wonder why we don't use more rounds before we apply bounded-round sorting. At a high level, the reason is that what we do before sorting is more similar to a top-k algorithm, and more than 3 rounds of interaction do not help much with the sample complexity of top-k; e.g. [Braverman et al.(2016)Braverman, Mao, and Weinberg] shows a 3-round noiseless top-k algorithm with nearly optimal sample complexity Õ(n).

For r ≥ 3, any r-round algorithm needs Ω̃(n^(2/r) k^((r-1)/r) + n) comparisons in expectation to solve sorted top-k. Any 2-round algorithm needs Ω(n√k + n^(4/3)) comparisons in expectation to solve sorted top-k.

The lower bounds above match the upper bounds in Theorem 3.1 (up to constant or polylogarithmic factors). The starting point of the proof is to reduce from top-k or sorting to sorted top-k. Indeed, sorted top-k is no easier than sorting the k items or finding the top-k items out of n items. However, this reduction is not good enough to give us tight lower bounds.

Let us go back to our sorted top-k algorithm in Theorem 3.1 and examine how it differs from an algorithm which is given the set of top-k items and just sorts these items. The main difference is that our sorted top-k algorithm spends a big fraction of comparisons in the first one or two rounds on items which are not top-k items. These comparisons are not useful for sorting the top-k items. Moreover, we can show that, without knowing the set of top-k items, not only our algorithm but any algorithm will make a good amount of comparisons outside the top-k in the first one or two rounds. For example, it is not hard to show that in expectation at most a (k/n)^2 fraction of first-round comparisons are between two items in the top-k. The argument for the second round is more complicated.

This is the critical point of our proof. Now we know that sorted top-k is no easier than sorting k items with an unbalanced number of comparisons across rounds (fewer comparisons in the first one or two rounds). In the rest of the proof, we adapt the lower bound for bounded-round sorting (Theorem 2.1 of [Alon and Azar(1988b)]) to our unbalanced setting. For details, see Section A.2.

### 3.2 Warm-up: Top-1 in the Noisy Case

Now we proceed to the noisy case. First of all, one can easily adapt a noiseless algorithm to the noisy case by repeating each comparison Θ(log(n)) times and using a union bound in the analysis. So the interesting question here is whether the sample complexity gets an extra Θ(log(n)) factor, no extra factor, or something in between, when we transition from the noiseless case to the noisy case.

In this sub-section, we show a 1-round algorithm for finding the top-1 in the noisy case with O(n^2) comparisons. The sample complexity does not get a blow-up in the noisy case. This algorithm is simpler than and different from our 1-round algorithms for top-k and sorted top-k in the noisy case. We offer it here as a warm-up for the noisy case.

Without loss of generality, we assume n is a power of 2. If n is not a power of 2, we can add fewer than n dummy items to make the total number of items a power of 2. This only increases the number of comparisons by a constant factor.

In Algorithm 1, we show our main recursive procedure FindMax for finding the top-1 item in a set S of size s. We will show by induction in Lemma 3.2 that it uses O(s^2 log(1/δ)) comparisons and succeeds with probability at least 1 − δ. If we run FindMax on all n items with a small constant δ, we get an algorithm for finding the top-1 of n items with probability at least 1 − δ in the noisy case using O(n^2) comparisons. Notice that although the procedure is defined recursively, no pair of items in a comparison depends on other comparisons' results. So all the comparisons can be done in 1 round.

This recursive procedure basically partitions the set S into two sets S_1 and S_2 of equal size and then finds the max of each set recursively: item y_1 and item y_2. After that, it compares y_1 and y_2 Θ(log(1/δ)) times to find the max of the two. In order to make all comparisons in 1 round, we actually compare all pairs of items (x_1, x_2) for x_1 ∈ S_1 and x_2 ∈ S_2.

In order to make this recursive procedure succeed with probability 1 − δ, we want the following three steps to each succeed with probability 1 − δ/3 and we take a union bound: (1) finding the max of S_1: item y_1; (2) finding the max of S_2: item y_2; (3) finding the max of y_1 and y_2. As you will see in the proof, the critical point of the argument is to show that the growth in the required success probability (from 1 − δ to 1 − δ/3) has much less effect on the sample complexity than the decrease in the set size (from s to s/2).
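The recursion just described can be sketched as follows. This is our simplified reconstruction of the FindMax idea, not the paper's Algorithm 1: for brevity we simulate only the comparison between the two recursive winners, whereas the actual 1-round algorithm issues the comparisons for all cross pairs up front; the repetition constant is an illustrative choice.

```python
import math, random

def find_max_1round(items, delta, compare, reps=None):
    """Sketch of the recursive FindMax idea.

    Splits the set in half, finds each half's winner recursively with
    failure budget delta/3, then decides between the two winners by a
    majority vote over repeated noisy comparisons.  Because repetition
    counts depend only on set sizes, every comparison could in principle
    be issued in a single round.
    """
    if len(items) == 1:
        return items[0]
    half = len(items) // 2
    y1 = find_max_1round(items[:half], delta / 3, compare, reps)
    y2 = find_max_1round(items[half:], delta / 3, compare, reps)
    # Majority vote over O(log(1/delta)) independent noisy comparisons.
    m = reps or (8 * math.ceil(math.log(3 / delta)) + 1)
    wins = sum(compare(y1, y2) for _ in range(m))
    return y1 if 2 * wins > m else y2
```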

**Lemma 3.2.** Let δ ∈ (0, 1/9]. FindMax(S, s, δ) and its descendants use at most 100⋅s^2⋅log(1/δ) comparisons. FindMax(S, s, δ) succeeds to output the top-1 of S with probability at least 1 − δ.

We know that s is always a power of 2, i.e. s = 2^t for some non-negative integer t. We prove the lemma by induction on t. The base case s = 1 is trivial.

Let's assume the lemma is true for sets of size s/2 and consider the case of size s. By the induction hypothesis, we know the number of comparisons in FindMax(S_1, s/2, δ/3) and its descendants is at most 100⋅(s/2)^2⋅log(3/δ). The same holds for FindMax(S_2, s/2, δ/3). Therefore, the total number of comparisons used by FindMax(S, s, δ) and its descendants is

2⋅100⋅(s/2)²⋅log(3/δ) + 100⋅log(1/δ)⋅(s/2)² = 100⋅(s²/4)⋅log(1/δ)⋅(2 + 2⋅log(3)/log(1/δ) + 1) ≤ 100⋅s²⋅log(1/δ),

where the last inequality uses log(1/δ) ≥ 2⋅log(3), i.e. δ ≤ 1/9.

In the case that y_1 is the top-1 of S_1, y_2 is the top-1 of S_2, and the majority of comparisons between y_1 and y_2 is consistent with their true ordering, FindMax(S, s, δ) succeeds to output the top-1 of S. By the induction hypothesis, each of the first two events happens with probability at least 1 − δ/3. By a Chernoff bound, the third event happens with probability at least 1 − δ/3. Therefore, by a union bound, FindMax(S, s, δ) succeeds to output the top-1 of S with probability at least 1 − δ.

### 3.3 Top-k in the Noisy Case

In this sub-section, we discuss top-k in the noisy case. All the detailed discussions and proofs for this sub-section can be found in Section B.

As discussed in the related work, top-k in the noisy case has been studied in prior work for larger numbers of rounds; nothing is known for r = 1 or r = 2. On the other hand, if we go back to the noiseless case, it has been shown in prior work that the sample complexity of top-k is Θ(n^2) for r = 1 and Θ(n^(4/3)) for r = 2.

We show top-k algorithms in the noisy case in Theorem 3.3 for r = 1 and r = 2. These upper bounds are tight up to a constant factor as they even match the lower bounds in the noiseless case. In other words, for top-k in 1 round or 2 rounds, the sample complexity does not get an extra factor when we go from the noiseless case to the noisy case.

**Theorem 3.3.** For top-k in the noisy case, there is a 1-round algorithm with sample complexity O(n^2) and a 2-round algorithm with sample complexity O(n^(4/3)).

Our 1-round algorithm starts from the simple idea of comparing each pair of items Θ(log(n)) times. The majority of these comparisons is consistent with the true ordering with probability 1 − 1/poly(n). By taking a union bound later in the analysis, noisy comparisons between the same pair of items can be treated as one noiseless comparison between them.
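The repeat-and-take-majority reduction can be sketched in a few lines; the repetition constant below is an illustrative choice, not taken from the paper.

```python
import math

def majority_compare(i, j, compare, n):
    """Repeat a 2/3-correct comparison Θ(log n) times and take the majority.

    By a Chernoff bound the majority outcome is wrong with probability
    1/poly(n), so a union bound over all pairs lets the analysis treat it
    as a single noiseless comparison.
    """
    m = 48 * math.ceil(math.log(max(n, 2))) + 1  # odd number of repetitions
    wins = sum(compare(i, j) for _ in range(m))
    return 2 * wins > m
```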

Since we plan to repeat each comparison Θ(log(n)) times and we have only O(n^2) comparisons, we cannot compare all pairs of items. So we use pivot items again. We pick n/log(n) pivot items at random and compare them to all items Θ(log(n)) times. We can then partition items into chunks. Items ranked before or after the chunk which contains the k-th item can easily be classified as inside the top-k or outside the top-k. For items inside this chunk, we don't know which ones are in the top-k. Since each chunk has Θ(log(n)) items in expectation, the number of such items is Θ(log(n)) in expectation.

How do we deal with these items? We use more random pivot items. We pick more random pivot items and further partition items into chunks of size Θ(log log(n)) in expectation. We call these new pivot items second-level pivot items and the previous pivot items first-level pivot items (see Figure 4). Here comes the critical point of the argument: since second-level pivot items are only used to partition items, and in the analysis we take a union bound over many fewer events, we don't need to repeat the comparison between each pair Θ(log(n)) times. Instead, repeating each comparison Θ(log log(n)) times suffices. And our total number of comparisons will still be O(n^2).

Finally we generalize this idea to multiple levels of pivot items, after which we can classify every item as inside the top-k or outside the top-k. Moreover, although these pivot items are divided into different levels, they are all chosen at random and compared to all items. So all the comparisons can be placed in a single round. The whole algorithm is formally described in Algorithm 4 in Section B. See also Figure 4 for a graphical view of the algorithm.

Now we proceed to describe our 2-round top-k algorithm in the noisy case. It is the most sophisticated algorithm in this paper. We are going to use O(n^(4/3)) comparisons.

It would be good to first understand the 2-round top-k algorithm in the noiseless case with O(n^(4/3)) comparisons. The idea is quite simple: we pick n^(1/3) random pivot items in the first round and partition items into chunks of size n^(2/3) in expectation. In the second round, we just need to focus on the chunk containing the k-th item. It has size n^(2/3) in expectation and we can just compare all pairs of items in this chunk.

Now in the noisy case, how do we still use only O(n^(4/3)) comparisons to find the top-k in 2 rounds? Repeating each comparison Θ(log(n)) times does not seem to work, since it reduces the number of random pivot items to n^(1/3)/log(n) and we would leave a chunk of too many items (i.e. Θ(n^(2/3) log(n)) items) to the second round.

In our 2-round algorithm, we still use n^(1/3) random pivot items in the first round. We can only compare them to all items a constant number of times, as we only have O(n^(4/3)) comparisons in total. We partition items into chunks as follows (see also Figure 5 for a graphical view). We first put the pivot items in the order of their ranks. We get this order correctly with high probability after the first round by making Θ(log(n)) comparisons between each pair of pivot items, in parallel with the other comparisons. For each item x, we keep a counter and compare x to the pivot items one by one. The counter is increased by one if the pivot item wins the majority of comparisons with item x and decreased by one otherwise. In the end, we put the item into the chunk next to the pivot item where its counter reaches its maximum. The analysis of this process is similar to a biased random walk. Notice that item x's counter has a higher chance of increasing before it reaches x's actual chunk and a higher chance of decreasing after it reaches x's actual chunk. We can show that although we may fail to put item x into its actual chunk, the probability that it is placed t chunks away from its actual chunk decays exponentially in t.
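The counter-based placement can be sketched as follows, assuming the pivots are already in rank order (best first) and `compare(p, x)` is a noisy comparison reporting whether pivot p beats item x; the function name and the constant number of repetitions are our illustrative choices.

```python
def assign_chunk(item, sorted_pivots, compare, reps=5):
    """Sketch of the counter-based placement in the 2-round algorithm.

    Scan the pivots in rank order (best first), keeping a counter that
    goes up when the pivot beats the item (majority of `reps` noisy
    comparisons) and down otherwise.  The item is placed just after the
    pivot where the counter peaks; the analysis views this trajectory as
    a biased random walk.
    """
    counter, best_value, best_pos = 0, 0, 0
    for pos, p in enumerate(sorted_pivots, start=1):
        wins = sum(compare(p, item) for _ in range(reps))
        counter += 1 if 2 * wins > reps else -1
        if counter > best_value:
            best_value, best_pos = counter, pos
    return best_pos  # chunk index: 0 means before the best pivot
```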

After the first round, we partition items into chunks of size n^(2/3) in expectation. As discussed above, this partition is not perfectly correct, but items won't be placed too far away from their actual chunks. If the partition were perfectly correct, then we could directly use our previous 1-round top-k algorithm described above as a blackbox to deal with the chunk containing the k-th item. But since the partition is not perfectly correct, we have to modify the 1-round algorithm before using it in the second round of our 2-round algorithm. The whole algorithm is formally described in Algorithm 5 in Section B.

### 3.4 Sorted Top-k in the Noisy Case

In this sub-section, we show our results for sorted top-k in the noisy case. All the detailed discussions and proofs can be found in Section C. In the noisy case for round number r ≥ 3, as described in the previous sub-section, we can adapt our noiseless algorithm into a noisy algorithm with sample complexity Θ̃(n^(2/r) k^((r-1)/r) + n). For r = 1 or 2, we show tight (up to a constant factor) sample complexity bounds in Theorem 3.4.

**Theorem 3.4.** The sample complexity of sorted top-k in the noisy case is Θ(n^2 log(k)) for r = 1 and Θ((n√k + n^(4/3)) log(k)) for r = 2.

Both of our sorted top-k algorithms in Theorem 3.4 are based on our top-k algorithms. Our 1-round sorted top-k algorithm is relatively simple given our 1-round top-k algorithm. We just compare all pairs Θ(log(k)) times and also run the 1-round top-k algorithm of Theorem 3.3 in the same round. After we make the comparisons, we learn the set of top-k items. The majority of comparisons between each pair is consistent with the actual ranking with probability 1 − 1/poly(k). Since we only need to focus on k items, we can take a union bound to show our algorithm is correct with large constant probability. For details, see Algorithm 8.

Interestingly, although sorted top-k is no easier than top-k, getting tight bounds for sorted top-k can be easier. Our 2-round sorted top-k algorithm is much simpler than our 2-round top-k algorithm, and it only depends on our 1-round top-k algorithm. When k is not tiny, log(n) = O(log(k)), so we can just use the sorted top-k algorithm in the noiseless case (Theorem 3.1) and repeat each comparison Θ(log(n)) times. When k is tiny, we partition all items into random groups, find the top-1 of each group in the first round (using the 1-round top-k algorithm of Theorem 3.3), and then find the sorted top-k of all these top-1's in the second round. For details, see Algorithm 9.

Now we describe how we prove the matching lower bounds, starting with the 1-round lower bound. The main idea is to show that if an algorithm does not make enough comparisons in one round, there must exist pairs of items which have consecutive ranks and are in the top-k, such that they are compared fewer than ε⋅log(k) times for some small constant ε. For any one such pair of items, if we just swap their ranks, the order of items in the top-k changes, and we can show that the chance of seeing the same comparison results decreases by at most a small polynomial factor in k. As long as the number of such pairs is much larger than this factor, we can show that a 1-round algorithm without enough comparisons outputs incorrectly with large probability. For details, see Lemma C.1.

For the 2-round lower bound, we still want to show that if an algorithm does not make enough comparisons in two rounds, there must exist enough pairs of items which have consecutive ranks and are in the top-k, such that they are compared fewer than ε⋅log(k) times for some small constant ε. The proof is more complicated as 2-round algorithms have adaptiveness, i.e. which items are compared in the second round depends on the comparison results of the first round. The main idea of the proof is to show that a bounded amount of first-round comparisons won't be enough to figure out which items are consecutively ranked. We explain the proof steps for a representative case here. We divide the top-k items into chunks. We show that after the first round, in a typical chunk, a constant fraction of items are compared to any item in the same chunk fewer than ε⋅log(k) times. We can then show that, given the first-round comparison results, there are many pairs of items which could be a consecutively ranked pair with non-negligible chance. As the algorithm makes at most a small enough constant factor times the target bound of comparisons, we can conclude that the algorithm misses comparing many consecutively ranked pairs enough times. The rest of the argument is similar to the 1-round lower bound. For details, see Lemma C.2.

## 4 Conclusion and Open Problems

In this paper, we characterize the optimal trade-off between the sample complexity and the round complexity of sorted top-k in both the noiseless case and the noisy case. For a fixed number of rounds, our sample complexity bound is tight up to a polylogarithmic factor.

When r = 1 or 2, we can make our sample complexity bound for sorted top-k tight up to a constant factor. We extend these results to top-k and sorting. These bounds also allow us to study the blow-up in the sample complexity when we transition from the noiseless case to the noisy case. Interestingly, for r = 1 or 2, this blow-up is different in different rank aggregation problems: Θ(1) in top-k, Θ(log(k)) in sorted top-k and Θ(log(n)) in sorting.

There are mainly two obstacles to getting tighter bounds for top-k, sorting and sorted top-k when we have more than 2 rounds. We list them as open problems here. The first one is that we don't have tight (up to a constant factor) sample complexity bounds even in the noiseless case.

###### Open Problem

Get tight (up to a constant factor) sample complexity bounds for the noiseless case when r ≥ 3.

In particular, the first step is to consider 3-round top-k in the noiseless case. [Braverman et al.(2016)Braverman, Mao, and Weinberg] shows its sample complexity is Õ(n). [Bollobás and Brightwell(1990)] shows that no 3-round algorithm with O(n) comparisons can find top-k correctly with probability 1 − o(1). If we only want to succeed with constant probability (for example 0.9), the best lower bound is the trivial one: Ω(n).

Once we have a good understanding of the noiseless case, we can start to think about the noisy case for r ≥ 3.

###### Open Problem

Extend our techniques for r = 1 or 2 in the noisy case to the case when we have more than 2 rounds.

In the noisy case, our 2-round bounds are very different from, and more involved than, our 1-round bounds. Even if we had tight bounds in the noiseless case, getting tight bounds for more than 2 rounds could be more difficult and might require new techniques.

We would like to thank Claire Mathieu for earlier discussions of this problem.

## References

• [Agarwal et al.(2017)Agarwal, Agarwal, Assadi, and Khanna] Arpit Agarwal, Shivani Agarwal, Sepehr Assadi, and Sanjeev Khanna. Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Proceedings of the 30th Conference on Learning Theory, COLT 2017, Amsterdam, The Netherlands, 7-10 July 2017, pages 39–75, 2017.
• [Ailon(2011)] N. Ailon. Active learning ranking from pairwise preferences with almost optimal query complexity. In Advances in Neural Information Processing Systems, 2011.
• [Ailon et al.(2008)Ailon, Charikar, and Newman] N Ailon, M. Charikar, and A. Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM, 55(5):23:1–23:27, 2008.
• [Ajtai et al.(1986)Ajtai, Komlos, Steiger, and Szemeredi] M. Ajtai, J. Komlos, W. L. Steiger, and E. Szemeredi. Deterministic selection in O(log log n) parallel time. In Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing, STOC '86, pages 188–195, New York, NY, USA, 1986. ACM. ISBN 0-89791-193-8. doi: 10.1145/12130.12149.
• [Ajtai et al.(1983)Ajtai, Komlós, and Szemerédi] Miklós Ajtai, János Komlós, and Endre Szemerédi. An o(n log n) sorting network. In Proceedings of the 15th Annual ACM Symposium on Theory of Computing, 25-27 April, 1983, Boston, Massachusetts, USA, pages 1–9, 1983.
• [Akl(1990)] Selim G. Akl. Parallel Sorting Algorithms. Academic Press, Inc., Orlando, FL, USA, 1990. ISBN 0120476800.
• [Alon(1985)] Noga Alon. Expanders, sorting in rounds and superconcentrators of limited depth. In Proceedings of the 17th Annual ACM Symposium on Theory of Computing, May 6-8, 1985, Providence, Rhode Island, USA, pages 98–102, 1985. doi: 10.1145/22145.22156.
• [Alon and Azar(1988a)] Noga Alon and Yossi Azar. Sorting, approximate sorting, and searching in rounds. SIAM J. Discrete Math., 1(3):269–280, 1988a. doi: 10.1137/0401028.
• [Alon and Azar(1988b)] Noga Alon and Yossi Azar. The average complexity of deterministic and randomized parallel comparison-sorting algorithms. SIAM J. Comput., 17(6):1178–1192, 1988b. doi: 10.1137/0217074.
• [Alon et al.(1986)Alon, Azar, and Vishkin] Noga Alon, Yossi Azar, and Uzi Vishkin. Tight complexity bounds for parallel comparison sorting. In 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 27-29 October 1986, pages 502–510, 1986. doi: 10.1109/SFCS.1986.57.
• [Azar and Pippenger(1990)] Yossi Azar and Nicholas Pippenger. Parallel selection. Discrete Applied Mathematics, 27(1-2):49–58, 1990.
• [Azar and Vishkin(1987)] Yossi Azar and Uzi Vishkin. Tight comparison bounds on the complexity of parallel sorting. SIAM J. Comput., 16(3):458–464, 1987. doi: 10.1137/0216032.
• [Bollobás and Brightwell(1990)] Béla Bollobás and Graham Brightwell. Parallel selection with high probability. SIAM J. Discrete Math., 3(1):21–31, 1990. doi: 10.1137/0403003.
• [Bollobás and Hell(1985)] Béla Bollobás and Pavol Hell. Sorting and graphs. In Ivan Rival, editor, Graphs and Order, volume 147 of NATO ASI Series, pages 169–184. Springer Netherlands, 1985. ISBN 978-94-010-8848-0.
• [Bollobás and Thomason(1983)] Béla Bollobás and Andrew Thomason. Parallel sorting. Discrete Applied Mathematics, 6(1):1 – 11, 1983. ISSN 0166-218X.
• [Braverman and Mossel(2008)] Mark Braverman and Elchanan Mossel. Noisy sorting without resampling. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’08, pages 268–276, Philadelphia, PA, USA, 2008. Society for Industrial and Applied Mathematics.
• [Braverman and Mossel(2009)] Mark Braverman and Elchanan Mossel. Sorting from noisy information. CoRR, abs/0910.1191, 2009. URL http://arxiv.org/abs/0910.1191.
• [Braverman et al.(2016)Braverman, Mao, and Weinberg] Mark Braverman, Jieming Mao, and S. Matthew Weinberg. Parallel algorithms for select and partition with noisy comparisons. In Proceedings of the Forty-eighth Annual ACM Symposium on Theory of Computing, STOC ’16, pages 851–862, New York, NY, USA, 2016. ACM. ISBN 978-1-4503-4132-5.
• [Chambers(1971)] J. M. Chambers. Algorithm 410: Partial sorting. Commun. ACM, 14(5):357–358, May 1971. ISSN 0001-0782.
• [Chen et al.(2017)Chen, Gopi, Mao, and Schneider] X. Chen, S. Gopi, J. Mao, and J. Schneider. Competitive analysis of the top-k ranking problem. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA), 2017.
• [Chen et al.(2018)Chen, Li, and Mao] Xi Chen, Yuanzhi Li, and Jieming Mao. A nearly instance optimal algorithm for top-k ranking under the multinomial logit model. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, 2018.
• [Chen and Suh(2015)] Yuxin Chen and Changho Suh. Spectral MLE: top-k rank aggregation from pairwise comparisons. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pages 371–380, 2015.
• [Feige et al.(1994)Feige, Raghavan, Peleg, and Upfal] Uriel Feige, Prabhakar Raghavan, David Peleg, and Eli Upfal. Computing with noisy information. SIAM J. Comput., 23(5):1001–1018, 1994.
• [Häggkvist and Hell(1981)] Roland Häggkvist and Pavol Hell. Parallel sorting with constant time for comparisons. SIAM J. Comput., 10(3):465–472, 1981. doi: 10.1137/0210034.
• [Jamieson and Nowak(2011)] K. Jamieson and R. Nowak. Active ranking using pairwise comparisons. In Advances in Neural Information Processing Systems, 2011.
• [Kenyon-Mathieu and Schudy(2007)] C. Kenyon-Mathieu and W. Schudy. How to rank with few errors. In Proceedings of the Symposium on Theory of computing (STOC), 2007.
• [Kruskal(1983)] Clyde P. Kruskal. Searching, merging, and sorting in parallel computation. IEEE Trans. Computers, 32(10):942–946, 1983.
• [Leighton(1984)] Frank Thomson Leighton. Tight bounds on the complexity of parallel sorting. In Proceedings of the 16th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1984, Washington, DC, USA, pages 71–80, 1984.
• [Lu and Boutilier(2011)] T. Lu and C. Boutilier. Learning mallows models with pairwise preferences. In Proceedings of the International Conference on Machine Learning (ICML), 2011.
• [Makarychev et al.(2013)Makarychev, Makarychev, and Vijayaraghavan] Konstantin Makarychev, Yury Makarychev, and Aravindan Vijayaraghavan. Sorting noisy data with partial information. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science, ITCS ’13, pages 515–528, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-1859-4.
• [Mohajer and Suh(2016)] S. Mohajer and C. Suh. Active top-k ranking from noisy comparisons. In Proceedings of the 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2016.
• [Negahban et al.(2017)Negahban, Oh, and Shah] S. Negahban, S. Oh, and D. Shah. Rank centrality: Ranking from pair-wise comparisons. Operations Research, 65(1):266–287, 2017.
• [Panconesi and Srinivasan(1997)] Alessandro Panconesi and Aravind Srinivasan. Randomized distributed edge coloring via an extension of the chernoff–hoeffding bounds. SIAM J. Comput., 26(2):350–368, April 1997. ISSN 0097-5397.
• [Pippenger(1987)] Nicholas Pippenger. Sorting and selecting in rounds. SIAM J. Comput., 16(6):1032–1038, 1987. doi: 10.1137/0216066.
• [Rajkumar and Agarwal(2014)] A. Rajkumar and S. Agarwal. A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In Proceedings of the International Conference on Machine Learning (ICML), 2014.
• [Reischuk(1981)] Rüdiger Reischuk. A fast probabilistic parallel sorting algorithm. In 22nd Annual Symposium on Foundations of Computer Science, Nashville, Tennessee, USA, 28-30 October 1981, pages 212–219, 1981. doi: 10.1109/SFCS.1981.6.
• [Shah and Wainwright(2015)] N. B. Shah and M. Wainwright. Simple, robust and optimal ranking from pairwise comparisons. arXiv preprint arXiv:1512.08949, 2015.
• [Shah et al.(2017)Shah, Balakrishnan, Guntuboyina, and Wainwright] N. B. Shah, S. Balakrishnan, A. Guntuboyina, and M. J. Wainwright. Stochastically transitive models for pairwise comparisons: Statistical and computational issues. IEEE Transactions on Information Theory, 63(2):934–959, 2017.
• [Suh et al.(2017)Suh, Tan, and Zhao] C. Suh, V. Tan, and R. Zhao. Adversarial top-k ranking. IEEE Transactions on Information Theory, 63(4):2201–2225, 2017.
• [Valiant(1975)] Leslie G. Valiant. Parallelism in comparison problems. SIAM J. Comput., 4(3):348–355, 1975. doi: 10.1137/0204030.
• [Wauthier et al.(2013)Wauthier, Jordan, and Jojic] F. Wauthier, M. Jordan, and N. Jojic. Efficient ranking from pairwise comparisons. In Proceedings of the International Conference on Machine Learning (ICML), 2013.

## Appendix A Sorted Top-k in the Noiseless Case

In this section, we show upper and lower bounds on the sample complexity of solving sorted top-k in the noiseless case.

First of all, it is easy to observe that the sample complexity of solving sorted top-k in 1 round is Θ(n²). For the upper bound, we just need to compare all pairs (there are n(n−1)/2 of them). For the lower bound, first observe that we can wlog assume the algorithm is deterministic. Then if the algorithm uses fewer than n(n−1)/2 comparisons, on a uniformly random input it misses the comparison between the best item and the second best item with positive probability, and therefore the algorithm cannot even guarantee to solve top-1.
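The 1-round upper bound can be sketched in a few lines: issue every pairwise comparison in a single non-adaptive batch, then rank items by their number of wins (the function name and interface are ours, for illustration).

```python
from itertools import combinations

def sorted_top_k_one_round(items, k, better):
    """1-round sorted top-k (a sketch; the name is ours): issue all
    n(n-1)/2 comparisons in a single non-adaptive batch, then rank each
    item by its number of wins. better(a, b) is True iff a outranks b."""
    wins = {x: 0 for x in items}
    for a, b in combinations(items, 2):  # one batch, no adaptivity
        wins[a if better(a, b) else b] += 1
    # Under a total order, the best item wins n-1 comparisons, the next
    # best wins n-2, and so on, so sorting by wins recovers the order.
    return sorted(items, key=lambda x: wins[x], reverse=True)[:k]
```

Since every comparison is decided before any answer is seen, this is a genuinely 1-round algorithm.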

For more than 1 round, we show algorithms in Section A.1 and lower bounds in Section A.2.

### A.1 Algorithms

Our algorithmic results are stated in Corollary A.1 (for 2 rounds) and Corollary A.2 (for r ≥ 3 rounds). They are based on two sub-routines: Algorithm 2 and Algorithm 3. Both of them use the sorting algorithm in [Alon et al.(1986)Alon, Azar, and Vishkin] as a black box (Theorem A.1).

Algorithm 2 is used when k is large. In the first round, we pick a random set of α items (call them “pivot items”) and partition the entire set into chunks by comparing all items to the pivot items. In the remaining r−1 rounds, we use the sorting algorithm in [Alon et al.(1986)Alon, Azar, and Vishkin] for each chunk that contains top-k items. We prove that Algorithm 2 works in Lemma A.1.
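The first-round pivot partition can be sketched as follows (a minimal sketch, not the paper's pseudocode; all names are ours, and items are assumed distinct).

```python
import random

def partition_round(items, num_pivots, better, rng=random):
    """First round of a pivot-based scheme: compare every item with every
    pivot in one non-adaptive batch, then place each item into the chunk
    indexed by how many pivots it beats. better(a, b) is True iff a
    outranks b; items must be distinct."""
    pivots = rng.sample(items, num_pivots)
    chunks = [[] for _ in range(num_pivots + 1)]
    for x in items:
        # All item-pivot comparisons can be issued simultaneously (one round).
        beats = sum(1 for p in pivots if p != x and better(x, p))
        chunks[beats].append(x)
    return chunks  # chunks[num_pivots] holds the items beating every pivot
```

Because each chunk collects exactly the items lying between two consecutive pivots in the underlying order, the chunks can then be sorted independently in the remaining rounds.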

Algorithm 3 is used when k is small. Compared with Algorithm 2, we pick the pivot items more carefully in Algorithm 3. In the first round, we extend the result of [Braverman et al.(2016)Braverman, Mao, and Weinberg] (stated in Theorem 3 and Corollary 3) to find the pivot items. In the second round, we partition the entire set into chunks by comparing all items to the pivot items. In the remaining r−2 rounds, we use the sorting algorithm in [Alon et al.(1986)Alon, Azar, and Vishkin] for each chunk that contains top-k items. We prove that Algorithm 3 works in Lemma A.2.

We first provide the final statements of our algorithmic results for sorted top-k in the noiseless case: There exists a 2-round algorithm that solves sorted top-k with O(n√k + n^(4/3)) comparisons in expectation.

There are two cases:

• When k ≥ n^(2/3), run Algorithm 2 to find the sorted top-k. This takes O(n√k) comparisons in expectation.

• When k < n^(2/3), run Algorithm 2 to find the sorted top-n^(2/3) and then output the sorted top-k. This takes O(n^(4/3)) comparisons in expectation.

For r ≥ 3, there exists an r-round algorithm that solves sorted top-k with Õ(n^(2/r) k^((r−1)/r) + n) comparisons in expectation.

There are three cases:

• When k is large, run Algorithm 2 to find the sorted top-k. This takes O(n^(2/r) k^((r−1)/r)) comparisons in expectation.

• When k is in an intermediate range, run Algorithm 3 to find the sorted top-k. This takes Õ(n) comparisons in expectation.

• When k is small, run Algorithm 3 to find the sorted top-k′ for a suitably larger k′ and then output the sorted top-k. This takes Õ(n) comparisons in expectation.

We now present the two sub-routines: Algorithm 2 and Algorithm 3.

[[Alon et al.(1986)Alon, Azar, and Vishkin]] For any fixed r ≥ 1, there exists an r-round algorithm which sorts n items with O(n^(1+1/r)) comparisons in expectation.
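The shape of such an r-round sorting guarantee can be illustrated with a simple recursive pivot scheme: roughly n^(1/r) random pivots per round, recursing on each chunk with one fewer round. This is a folklore sketch with our own names, not the actual Alon–Azar–Vishkin procedure.

```python
import random
from itertools import combinations

def sort_in_rounds(items, r, better, rng=random):
    """Sort in r rounds with roughly O(n^(1+1/r)) expected comparisons via
    recursive random pivoting -- an illustrative sketch, not the actual
    Alon-Azar-Vishkin algorithm. Items must be distinct; better(a, b) is
    True iff a outranks b. Returns items from best to worst."""
    n = len(items)
    if n <= 1:
        return list(items)
    if r == 1:
        # Last round: compare all pairs at once and rank items by wins.
        wins = {x: 0 for x in items}
        for a, b in combinations(items, 2):
            wins[a if better(a, b) else b] += 1
        return sorted(items, key=lambda x: wins[x], reverse=True)
    m = max(1, round(n ** (1.0 / r)))  # ~n^(1/r) pivots this round
    pivots = rng.sample(items, m)
    chunks = [[] for _ in range(m + 1)]
    for x in items:
        beats = sum(1 for p in pivots if p != x and better(x, p))
        chunks[beats].append(x)
    out = []
    for c in reversed(chunks):  # best chunk first
        out.extend(sort_in_rounds(c, r - 1, better, rng))
    return out
```

Balancing the n·m first-round comparisons against the recursive cost on chunks of expected size n/m is what drives the choice m ≈ n^(1/r).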

For α = n^(2/r−1) k^((r−1)/r) and k ≥ n^((2r−2)/(2r−1)), Algorithm 2 is always correct and uses O(n^(2/r) k^((r−1)/r)) comparisons in expectation.

The correctness of the algorithm is easy to check. To prove the lemma, it suffices to bound the expected number of comparisons used by the algorithm. In the first round, the algorithm uses O(nα) comparisons. From round 2 to round r, by Theorem A.1, the algorithm uses O(E[∑_{i=1}^{l} |N_i|^{1+1/(r−1)}]) comparisons in expectation. It suffices to prove that E[∑_{i=1}^{l} |N_i|^{1+1/(r−1)}] = O(n^{2/r} k^{(r−1)/r}). Notice that l and the N_i's are random variables depending on the randomness of the algorithm, where N_1, …, N_l are the chunks that contain top-k items.

For i = 1, …, k, define g(i) such that item i is in N_{g(i)}. Since N_l = N_{g(k)} is the chunk containing item k, and every chunk N_i with i < l lies entirely inside the top-k, we have

\[ E\Big[\sum_{i=1}^{l}|N_i|^{1+1/(r-1)}\Big] = E\Big[|N_l|^{1+1/(r-1)}+\sum_{i=1}^{l-1}|N_i|^{1+1/(r-1)}\Big] \le E\Big[|N_{g(k)}|^{1+1/(r-1)}+\sum_{i=1}^{k}|N_{g(i)}|^{1/(r-1)}\Big]. \]

For each i ≤ k, we will upper bound E[|N_{g(i)}|^β] for β = 1/(r−1) and β = 1 + 1/(r−1). We start by considering N_{g(i)} ∩ {j | j ≤ i} (the set of items that have ranks no worse than item i and are put into the same partitioned set as item i). Its size is at least j only if none of the α random pivot items falls among the j items of rank just above i, an event of probability at most (1−j/n)^α. Using E[X^β] = ∑_j (j^β − (j−1)^β)·P(X ≥ j), for β ≥ 1,

\[ E\big[|N_{g(i)}\cap\{j \mid j\le i\}|^{\beta}\big] \le \sum_{j=1}^{n}\big(j^{\beta}-(j-1)^{\beta}\big)(1-j/n)^{\alpha} \le \int_{0}^{n}\beta x^{\beta-1}(1-x/n)^{\alpha}\,dx. \]

When β = 1, we have

\[ \int_{0}^{n}x^{\beta-1}(1-x/n)^{\alpha}\,dx=\int_{0}^{n}(1-x/n)^{\alpha}\,dx=\frac{n}{\alpha+1}. \]

When β = 2, we have

\[ \int_{0}^{n}x^{\beta-1}(1-x/n)^{\alpha}\,dx=\frac{n^{2}}{\alpha+1}-\frac{n^{2}}{\alpha+2}=\frac{n^{2}}{(\alpha+2)(\alpha+1)}<\Big(\frac{n}{\alpha}\Big)^{2}. \]

When 1 < β < 2, by the concavity of x^{β−1} (applying Jensen's inequality to the distribution with density proportional to (1−x/n)^α), we have

\[ \int_{0}^{n}x^{\beta-1}(1-x/n)^{\alpha}\,dx \le \Big(\int_{0}^{n}(1-x/n)^{\alpha}\,dx\Big)\Bigg(\frac{\int_{0}^{n}x(1-x/n)^{\alpha}\,dx}{\int_{0}^{n}(1-x/n)^{\alpha}\,dx}\Bigg)^{\beta-1} < \Big(\frac{n}{\alpha}\Big)^{\beta}. \]

So for 1 ≤ β ≤ 2, we have

\[ E\big[|N_{g(i)}\cap\{j\mid j\le i\}|^{\beta}\big]\le \beta\Big(\frac{n}{\alpha}\Big)^{\beta}. \]

For β ≤ 1, by the concavity of x^{β} we have

\[ E\big[|N_{g(i)}\cap\{j\mid j\le i\}|^{\beta}\big]\le E\big[|N_{g(i)}\cap\{j\mid j\le i\}|\big]^{\beta}\le\Big(\frac{n}{\alpha}\Big)^{\beta}. \]

By symmetry, we can also get the same upper bound on E[|N_{g(i)} ∩ {j | j ≥ i}|^β]. Therefore, for β = 1/(r−1) and β = 1 + 1/(r−1):

\[ E\big[|N_{g(i)}|^{\beta}\big] \le E\big[\big(|N_{g(i)}\cap\{j\mid j\le i\}|+|N_{g(i)}\cap\{j\mid j\ge i\}|\big)^{\beta}\big] \le E\big[\big(2|N_{g(i)}\cap\{j\mid j\le i\}|\big)^{\beta}\big]+E\big[\big(2|N_{g(i)}\cap\{j\mid j\ge i\}|\big)^{\beta}\big] \le O\big((n/\alpha)^{\beta}\big). \]

Notice that both exponents fall in the ranges covered above: 1 + 1/(r−1) ∈ (1, 2] and 1/(r−1) ∈ (0, 1] for r ≥ 2. To sum up, we get

\[ E\Big[\sum_{i=1}^{l}|N_i|^{1+1/(r-1)}\Big]=O\big((n/\alpha)^{1+1/(r-1)}+k\cdot(n/\alpha)^{1/(r-1)}\big)=O\big(n^{2/r}k^{(r-1)/r}\big). \]
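The integral estimates in this proof can be sanity-checked numerically. The sketch below (the helper name and sample parameter values are ours) verifies the bound ∫₀ⁿ βx^{β−1}(1−x/n)^α dx ≤ β(n/α)^β, reading the right-hand side as (n/α)^β, with a midpoint Riemann sum.

```python
def integral_bound_check(n, alpha, beta, steps=200000):
    """Numerically compare both sides of the chunk-size bound
        integral_0^n beta * x^(beta-1) * (1-x/n)^alpha dx  vs  beta * (n/alpha)^beta
    using a midpoint Riemann sum. Returns (lhs_estimate, rhs)."""
    h = n / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += beta * x ** (beta - 1) * (1 - x / n) ** alpha * h
    return total, beta * (n / alpha) ** beta
```

For β = 1 the estimate should approach n/(α+1), matching the closed form computed above.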

Algorithm 5 in Appendix C.1.1 of [Braverman et al.(2016)Braverman, Mao, and Weinberg] can easily be extended to prove the following theorem. In that algorithm, for their purpose, only a single one of the output items is explicitly computed after one round. However, it is not hard to see that the corresponding items for all indices can be computed in the same way using the same set of comparisons. The only change is that the failure probability is multiplied by the number of indices, by a union bound.

[[Braverman et al.(2016)Braverman, Mao, and Weinberg]] There exists a 1-round algorithm which outputs a list of items a_i for all i such that, with high probability, every a_i's rank is close to i.

Using Theorem 3, we can get the following corollary. There exists a 1-round algorithm which outputs a list of a_i's for all i such that, with high probability, each a_i's rank is at most