Power of d Choices with Simple Tabulation

04/25/2018
by Anders Aamand, et al.

Suppose that we are to place m balls into n bins sequentially using the d-choice paradigm: for each ball we are given a choice of d bins, according to d hash functions h_1,...,h_d, and we place the ball in the least loaded of these bins, breaking ties arbitrarily. Our interest is in the number of balls in the fullest bin after all m balls have been placed. Azar et al. [STOC'94] proved that when m=O(n) and the hash functions are fully random, the maximum load is at most lg lg n / lg d + O(1) whp (i.e. with probability 1-O(n^{-γ}) for any choice of γ). In this paper we suppose that h_1,...,h_d are simple tabulation hash functions. Generalising a result by Dahlgaard et al. [SODA'16], we show that for an arbitrary constant d≥2 the maximum load is O(lg lg n) whp, and that the expected maximum load is at most lg lg n / lg d + O(1). We further show that by using a simple tie-breaking algorithm introduced by Vöcking [J.ACM'03], the expected maximum load drops to lg lg n / (d lg φ_d) + O(1), where φ_d is the rate of growth of the d-ary Fibonacci numbers. Both of these expected bounds match those of the fully random setting. The analysis by Dahlgaard et al. relies on a proof by Pătraşcu and Thorup [J.ACM'11] concerning the use of simple tabulation for cuckoo hashing. We need here a generalisation to d>2 hash functions, but the original proof is an 8-page tour de force of ad-hoc arguments that do not appear to generalise. Our main technical contribution is a shorter, simpler and more accessible proof of the result by Pătraşcu and Thorup, where the relevant parts generalise nicely to the analysis of d choices.
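
The d-choice process with simple tabulation can be illustrated with a minimal sketch. This is not the authors' code: the function names, the choice of 32-bit keys split into four 8-bit characters, and the restriction to n being a power of two are all assumptions made for illustration. Each hash function XORs together one random table entry per key character, and each ball goes to the least loaded of its d candidate bins, with ties broken by taking the first candidate.

```python
import random


def make_simple_tabulation(num_chars, char_bits, out_bits, rng):
    """Build one simple tabulation hash function (hypothetical helper).

    A key of num_chars * char_bits bits is viewed as num_chars characters.
    Each character position has its own table of random out_bits-bit values,
    and the hash is the XOR of the looked-up entries.
    """
    tables = [
        [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
        for _ in range(num_chars)
    ]
    mask = (1 << char_bits) - 1

    def h(key):
        value = 0
        for i in range(num_chars):
            value ^= tables[i][(key >> (i * char_bits)) & mask]
        return value

    return h


def d_choice_max_load(keys, n_bits, d, rng):
    """Place each key into the least loaded of its d hashed bins.

    Assumes n = 2**n_bits bins and keys that fit in 32 bits.
    Returns the maximum load and the full list of bin loads.
    """
    n = 1 << n_bits
    hashes = [make_simple_tabulation(4, 8, n_bits, rng) for _ in range(d)]
    loads = [0] * n
    for key in keys:
        candidates = [h(key) for h in hashes]
        # min() returns the first minimum, which is one arbitrary tie-break.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return max(loads), loads


if __name__ == "__main__":
    rng = random.Random(2018)
    m = n = 1 << 12
    max_load, _ = d_choice_max_load(range(m), 12, 2, rng)
    print("maximum load with d=2:", max_load)
```

Taking the first minimum is just one arbitrary tie-break. The tie-breaking rule of Vöcking mentioned in the abstract is a different scheme and is not implemented in this sketch.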
