A Hash Table Without Hash Functions, and How to Get the Most Out of Your Random Bits

09/13/2022
by William Kuszmaul, et al.

This paper considers a basic question: how strong a probabilistic guarantee can a hash table storing n key/value pairs, each of (1 + Θ(1)) log n bits, offer? Past work on this question has been bottlenecked by limitations of the known families of hash functions: the only hash tables to achieve failure probabilities less than 1 / 2^(polylog n) require access to fully random hash functions; if the same hash tables are implemented using the known explicit families of hash functions, their failure probabilities become 1 / poly(n). To get around these obstacles, we show how to construct a randomized data structure that has the same guarantees as a hash table but avoids the direct use of hash functions. Building on this, we construct a hash table using O(n) random bits that achieves failure probability 1 / n^(n^(1 − ϵ)) for an arbitrary positive constant ϵ. In fact, we show that this guarantee can even be achieved by a succinct dictionary, that is, by a dictionary that uses space within a 1 + o(1) factor of the information-theoretic optimum. Finally, we also construct a succinct hash table whose probabilistic guarantees fall on a different extreme, offering a failure probability of 1 / poly(n) while using only Õ(log n) random bits. This latter result matches (up to low-order terms) a guarantee previously achieved by Dietzfelbinger et al., but with increased space efficiency and with several surprising technical components.
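To get a feel for how far apart these guarantees are, the following is a minimal sketch that compares three failure-probability regimes on a log scale: inverse-polynomial, inverse-quasi-polynomial (2 raised to a polylog), and 1 / n^(n^(1 − ϵ)). The function name `failure_exponents` and the constants `c`, `k`, and `eps` are hypothetical choices for illustration and do not come from the paper; the asymptotic bounds hide constants that this sketch simply fixes.

```python
import math


def failure_exponents(n, c=2.0, k=2.0, eps=0.5):
    """Return -log2(failure probability) for three illustrative regimes.

    c, k, and eps are assumed constants picked for illustration only:
      1/poly(n)        is modeled as 1 / n^c
      1/2^(polylog n)  is modeled as 1 / 2^((log2 n)^k)
      1/n^(n^(1-eps))  is taken literally with the given eps
    A larger returned value means a smaller failure probability.
    """
    log_n = math.log2(n)
    return {
        "1/poly(n)": c * log_n,
        "1/2^(polylog n)": log_n ** k,
        "1/n^(n^(1-eps))": (n ** (1 - eps)) * log_n,
    }


if __name__ == "__main__":
    for n in (2**10, 2**20, 2**30):
        bits = failure_exponents(n)
        print(n, {name: round(value) for name, value in bits.items()})
```

Under these assumed constants, the exponent of the third regime grows polynomially in n, while the first two grow only polylogarithmically, which is why the gap widens so quickly as n increases.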
