Upper Tail Analysis of Bucket Sort and Random Tries

02/24/2020
by Ioana O. Bercea, et al.

Bucket Sort is known to run in expected linear time when the input keys are distributed independently and uniformly at random in the interval [0,1). The analysis holds even when a quadratic time algorithm is used to sort the keys in each bucket. We show how to obtain linear time guarantees on the running time of Bucket Sort that hold with very high probability. Specifically, we investigate the asymptotic behavior of the exponent in the upper tail probability of the running time of Bucket Sort. We consider large additive deviations from the expectation, of the form cn for large enough (constant) c, where n is the number of keys that are sorted. Our analysis shows a profound difference between variants of Bucket Sort that use a quadratic time algorithm within each bucket and variants that use a Θ(b log b) time algorithm for sorting b keys in a bucket. When a quadratic time algorithm is used to sort the keys in a bucket, the probability that Bucket Sort takes cn more time than expected is exponential in Θ(√n log n). When a Θ(b log b) algorithm is used to sort the keys in a bucket, the exponent becomes Θ(n). We prove this latter theorem by showing an upper bound on the tail of a random variable defined on tries, a result which we believe is of independent interest. This result also enables us to analyze the upper tail probability of a well-studied trie parameter, the external path length, and show that the probability that it deviates from its expected value by an additive term of cn is exponential in Θ(n).
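The setting in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes n buckets for n keys, uses insertion sort as the quadratic in-bucket algorithm, and counts key comparisons as a stand-in for running time. The function names `insertion_sort` and `bucket_sort` are hypothetical.

```python
import random


def insertion_sort(keys):
    """Sort a list in place and return the number of key comparisons made."""
    comparisons = 0
    for i in range(1, len(keys)):
        x = keys[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if keys[j] <= x:
                break
            keys[j + 1] = keys[j]  # shift the larger key one slot right
            j -= 1
        keys[j + 1] = x
    return comparisons


def bucket_sort(keys):
    """Sort keys drawn from [0, 1) using n buckets (an assumption of this sketch).

    Returns (sorted_keys, total_comparisons), where the comparison count is
    the cost proxy whose upper tail one could study empirically.
    """
    n = len(keys)
    if n == 0:
        return [], 0
    buckets = [[] for _ in range(n)]
    for x in keys:
        # min() guards against floating-point rounding pushing x * n up to n.
        buckets[min(int(x * n), n - 1)].append(x)
    out, total = [], 0
    for b in buckets:
        total += insertion_sort(b)  # quadratic worst case per bucket
        out.extend(b)
    return out, total


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.random() for _ in range(1000)]
    result, cost = bucket_sort(sample)
    print(result == sorted(sample), cost)
```

Repeating the run over many independent inputs and recording how often the cost exceeds its average by cn gives an empirical view of the upper tail. Swapping `insertion_sort` for a Θ(b log b) routine gives the other variant discussed above.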


