Simple Set Sketching

11/07/2022
by Jakob Bæk Tejs Houen, et al.

Imagine handling collisions in a hash table by storing, in each cell, the bit-wise exclusive-or of the set of keys hashing there. This appears to be a terrible idea: for αn keys and n buckets, where α is constant, we expect that a constant fraction of the keys will be unrecoverable due to collisions. We show that if this collision resolution strategy is repeated three times independently, the situation reverses: if α is below a threshold of ≈ 0.81, then we can recover the set of all inserted keys in linear time with high probability. Even though the description of our data structure is simple, its analysis is nontrivial. Our approach can be seen as a variant of the Invertible Bloom Filter (IBF) of Eppstein and Goodrich. While IBFs involve an explicit checksum per bucket to decide whether the bucket stores a single key, we exploit the idea of quotienting, namely that some bits of the key are implicit in the location where it is stored. We let those bits serve as an implicit checksum. They are not quite enough to ensure that no errors occur, and the main technical challenge is to show that decoding can recover from these errors.
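The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction: the BLAKE2-based `cell_index` hash, the table size `m`, the scan-based peeling loop, and the step limit are all assumptions made for illustration, and keys are assumed to be nonzero 64-bit integers. Instead of true quotienting, the sketch uses the test "does the stored value hash back to this cell?" as a stand-in for the implicit checksum. Because that test can occasionally accept a cell holding several keys, a recovered value is toggled in the output set rather than only added, so a wrongly peeled value can be undone later.

```python
import hashlib

NUM_TABLES = 3  # three independent repetitions, as described in the abstract


def cell_index(table: int, key: int, m: int) -> int:
    """Hypothetical hash function: map a 64-bit key to a cell of one table."""
    digest = hashlib.blake2b(
        key.to_bytes(8, "little"), digest_size=8, salt=bytes([table])
    ).digest()
    return int.from_bytes(digest, "little") % m


def toggle(tables, key):
    """XOR the key into one cell per table; the same call inserts or deletes."""
    m = len(tables[0])
    for t in range(NUM_TABLES):
        tables[t][cell_index(t, key, m)] ^= key


def build(keys, m):
    """Build three tables of m cells each from nonzero 64-bit integer keys."""
    tables = [[0] * m for _ in range(NUM_TABLES)]
    for key in keys:
        toggle(tables, key)
    return tables


def decode(tables, max_steps=None):
    """Peel cells that appear to hold a single key.

    Returns the recovered set, or None if the tables cannot be emptied
    within the (assumed) step limit.
    """
    tables = [row[:] for row in tables]  # work on a copy
    m = len(tables[0])
    if max_steps is None:
        max_steps = 10 * NUM_TABLES * m
    recovered = set()
    steps = 0
    progress = True
    while progress:
        progress = False
        for t in range(NUM_TABLES):
            for i in range(m):
                x = tables[t][i]
                # Implicit checksum stand-in: the value must hash to its own cell.
                if x != 0 and cell_index(t, x, m) == i:
                    toggle(tables, x)   # remove x from all three tables
                    recovered ^= {x}    # toggle, so a false positive can be undone
                    steps += 1
                    progress = True
                    if steps > max_steps:
                        return None
    if any(any(row) for row in tables):
        return None  # some cells could not be resolved
    return recovered
```

For example, `decode(build([42], 16))` returns `{42}`, since a lone key always hashes back to the cell that holds it. Larger sets decode with high probability only when the tables are sufficiently sparse; the ≈ 0.81 threshold in the abstract applies to the authors' scheme, not necessarily to this simplified variant.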


research
06/04/2023

Tight Cell-Probe Lower Bounds for Dynamic Succinct Dictionaries

A dictionary data structure maintains a set of at most n keys from the u...
research
09/13/2018

A Self-Stabilizing Hashed Patricia Trie

While a lot of research in distributed computing has covered solutions f...
research
10/31/2021

On the Optimal Time/Space Tradeoff for Hash Tables

For nearly six decades, the central open question in the study of hash t...
research
08/30/2023

Optimal Non-Adaptive Cell Probe Dictionaries and Hashing

We present a simple and provably optimal non-adaptive cell probe data st...
research
06/13/2023

Invertible Bloom Lookup Tables with Less Memory and Randomness

In this work we study Invertible Bloom Lookup Tables (IBLTs) with small ...
research
01/01/2023

Time-Entanglement QKD: Secret Key Rates and Information Reconciliation Coding

In time entanglement-based quantum key distribution (QKD), Alice and Bob...
research
04/21/2023

Learned Monotone Minimal Perfect Hashing

A Monotone Minimal Perfect Hash Function (MMPHF) constructed on a set S ...
