Optimal Round and Sample-Size Complexity for Partitioning in Parallel Sorting

04/10/2022
by Wentao Yang, et al.

State-of-the-art parallel sorting algorithms for distributed-memory architectures are based on computing a balanced partitioning via sampling and histogramming. By finding samples that partition the sorted keys into evenly sized chunks, these algorithms minimize the number of communication rounds required. Histogramming (computing the positions of samples in the global sorted order) guides sampling, which reduces the overall number of samples collected. We derive lower and upper bounds on the number of sampling/histogramming rounds required to compute a balanced partitioning. We improve on prior results by showing that, when using p processors/parts, O(log^* p) rounds with O(p/log^* p) samples per round suffice. We match this with a lower bound showing that any algorithm using O(p) samples per round requires Ω(log^* p) rounds. Additionally, we prove an Ω(p log p) lower bound on the number of samples when only one round is used, showing the optimality of sample sort in this case. To derive the lower bound, we propose a hard randomized input distribution and apply classical results from the distribution theory of runs.
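
To make the sampling/histogramming loop concrete, the following is a minimal single-process sketch, not the algorithm analyzed in the paper, and it makes no attempt to reach the O(log^* p) round bound. It assumes distinct keys, simulates the p processors as a list of sorted local chunks, and draws probes uniformly from the keys that could still be valid splitters. The names `log_star`, `global_ranks`, `find_splitters`, and the parameters `eps` and `samples_per_round` are hypothetical and chosen for illustration only.

```python
import bisect
import math
import random


def log_star(n):
    """Iterated base-2 logarithm: how often log2 is applied until the value is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


def global_ranks(local_chunks, probes):
    """One histogramming step: for each probe, count the keys strictly below it.

    Each inner sum stands in for a reduction across processors.
    """
    return [sum(bisect.bisect_left(chunk, s) for chunk in local_chunks)
            for s in probes]


def find_splitters(local_chunks, eps=0.1, samples_per_round=None, seed=0):
    """Find p-1 splitters whose global ranks lie within eps*n/p of i*n/p.

    Assumes distinct keys and sorted local chunks. Returns (splitters, rounds).
    """
    rng = random.Random(seed)
    p = len(local_chunks)
    n = sum(len(chunk) for chunk in local_chunks)
    m = samples_per_round or p          # assumption: p probes per round by default
    targets = [i * n // p for i in range(1, p)]
    tol = eps * n / p
    # For each target, an open key interval (lo, hi) that still contains a valid splitter.
    lo = [-math.inf] * (p - 1)
    hi = [math.inf] * (p - 1)
    splitters = [None] * (p - 1)
    rounds = 0
    while any(s is None for s in splitters):
        unresolved = [i for i, s in enumerate(splitters) if s is None]
        # Sampling: collect keys that fall inside some unresolved interval.
        candidates = set()
        for i in unresolved:
            for chunk in local_chunks:
                a = bisect.bisect_right(chunk, lo[i])
                b = bisect.bisect_left(chunk, hi[i])
                candidates.update(chunk[a:b])
        if not candidates:
            raise ValueError("no candidate keys left; are the keys distinct?")
        probes = rng.sample(sorted(candidates), min(m, len(candidates)))
        # Histogramming: learn the global rank of every probe.
        ranks = global_ranks(local_chunks, probes)
        rounds += 1
        for s, r in zip(probes, ranks):
            for i in unresolved:
                if splitters[i] is not None:
                    continue
                if abs(r - targets[i]) <= tol:
                    splitters[i] = s            # probe is close enough to the target rank
                elif r < targets[i]:
                    lo[i] = max(lo[i], s)       # splitter must be a larger key
                else:
                    hi[i] = min(hi[i], s)       # splitter must be a smaller key
    return splitters, rounds
```

Calling `find_splitters(chunks, eps=0.1)` on p sorted chunks returns the chosen splitters and the number of sampling/histogramming rounds this naive strategy used. `log_star(p)` shows how slowly the paper's round bound grows: it is 4 for p = 65536.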
