
Ribbon filter: practically smaller than Bloom and Xor
Filter data structures overapproximate a set of hashable keys, i.e. set...

Don't Thrash: How to Cache Your Hash on Flash
This paper presents new alternatives to the well-known Bloom filter data...

An Overview of Cryptographic Accumulators
This paper is a primer on cryptographic accumulators and how to apply th...

Space-Efficient Data Structures for Lattices
A lattice is a partially-ordered set in which every pair of elements has...

Optimal Las Vegas Approximate Near Neighbors in ℓ_p
We show that approximate near neighbor search in high dimensions can be ...

Derandomization of Cell Sampling
Since 1989, the best known lower bound on static data structures was Sie...

Meta-Learning Neural Bloom Filters
There has been a recent trend in training neural networks to replace dat...

Fast Succinct Retrieval and Approximate Membership using Ribbon
A retrieval data structure for a static function f:S→{0,1}^r supports queries that return f(x) for any x ∈ S. Retrieval data structures can be used to implement a static approximate membership query data structure (AMQ) (i.e., a Bloom filter alternative) with false positive rate 2^{-r}. The information-theoretic lower bound for both tasks is r|S| bits. While succinct theoretical constructions using (1+o(1))r|S| bits were known, these could not achieve very small overheads in practice because they have an unfavorable space-time tradeoff hidden in the asymptotic costs or because small overheads would only be reached for physically impossible input sizes. With bumped ribbon retrieval (BuRR), we present the first practical succinct retrieval data structure. In an extensive experimental evaluation, BuRR achieves space overheads well below 1% while being faster than most previously used retrieval data structures (typically with space overheads at least an order of magnitude larger) and faster than classical Bloom filters (with space overhead ≥ 44%). This efficiency, including favorable constants, stems from a combination of simplicity, word parallelism, and high locality. We additionally describe homogeneous ribbon filter AMQs, which are even simpler and faster at the price of slightly larger space overhead.
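
To make the retrieval-to-AMQ reduction concrete, here is a minimal Python sketch, not the paper's BuRR implementation. It assumes a plain, unbumped ribbon-style linear system over GF(2): each key hashes to a start position, a W-bit coefficient word, and an r-bit fingerprint, and the solved table returns that fingerprint for every member. The names `W`, `build`, `query`, the `slack` parameter, and the use of SHA-256 are illustrative choices for this sketch only.

```python
import hashlib

W = 32  # ribbon width in bits (assumed value for this sketch)

def _hash(key, seed, m, r):
    """Derive (start, coefficient word, r-bit fingerprint) from one digest."""
    d = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    start = int.from_bytes(d[0:8], "little") % (m - W + 1)
    coeff = (int.from_bytes(d[8:16], "little") & ((1 << W) - 1)) | 1  # low bit forced to 1
    fp = int.from_bytes(d[16:24], "little") & ((1 << r) - 1)
    return start, coeff, fp

def _insert(coeffs, consts, i, c, b):
    """Add the equation 'c at offset i equals b' by on-the-fly elimination over GF(2)."""
    while True:
        if coeffs[i] == 0:          # free pivot row: store the equation here
            coeffs[i], consts[i] = c, b
            return True
        c ^= coeffs[i]              # cancel the leading coefficient
        b ^= consts[i]
        if c == 0:                  # redundant if b == 0, inconsistent otherwise
            return b == 0
        tz = (c & -c).bit_length() - 1  # shift to the next nonzero coefficient
        c >>= tz
        i += tz

def _back_substitute(coeffs, consts):
    """Solve the triangular system from the last row to the first."""
    m = len(coeffs)
    z = [0] * m                     # unconstrained rows stay 0
    for i in range(m - 1, -1, -1):
        c = coeffs[i]
        if c == 0:
            continue
        acc, c, j = consts[i], c >> 1, i + 1
        while c:
            if c & 1:
                acc ^= z[j]
            c >>= 1
            j += 1
        z[i] = acc
    return z

def build(keys, r=8, slack=1.25, max_seeds=16):
    """Build a static filter; retries with a new hash seed if the system is unsolvable."""
    keys = list(keys)
    m = int(slack * len(keys)) + W  # generous slack, far from the paper's overheads
    for seed in range(max_seeds):
        coeffs, consts = [0] * m, [0] * m
        if all(_insert(coeffs, consts, *_hash(k, seed, m, r)) for k in keys):
            return seed, _back_substitute(coeffs, consts), r
    raise RuntimeError("construction failed for all seeds tried")

def query(filt, key):
    """True for every member; true for a non-member with probability about 2^-r."""
    seed, z, r = filt
    i, c, fp = _hash(key, seed, len(z), r)
    acc = 0
    while c:
        if c & 1:
            acc ^= z[i]
        c >>= 1
        i += 1
    return acc == fp
```

Calling `filt = build(keys)` and then `query(filt, k)` returns `True` for every `k` in `keys`; for other strings the observed false positive rate should be roughly 2^{-8} with the default `r=8`. The table here stores about 1.25·r bits per key, so the sketch illustrates the mechanism only, not the sub-1% overhead the abstract reports for BuRR.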