Partitioned Learned Bloom Filter

06/05/2020
by Kapil Vaidya, et al.

Bloom filters are space-efficient probabilistic data structures used to test whether an element is a member of a set; they may return false positives but never false negatives. Recently, variations referred to as learned Bloom filters were developed that can reduce the false positive rate by using a learned model for the represented set. However, previous methods for learned Bloom filters do not take full advantage of the learned model. Here we show how to frame the problem of optimal model utilization as an optimization problem and, using our framework, derive algorithms that can achieve near-optimal performance in many cases. Experimental results from both simulated and real-world datasets show significant performance improvements from our optimization approach over both the original learned Bloom filter constructions and previously proposed heuristic improvements.
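
For readers new to the two structures the abstract contrasts, the following is a minimal Python sketch of a standard Bloom filter and of the original learned Bloom filter construction (a model score with a single threshold, plus a backup filter for keys the model misses). It is not the partitioned method this paper proposes. The class names, the `score` callable, the `threshold` value, and the SHA-256 double-hashing scheme are all assumptions chosen for illustration.

```python
import hashlib
import math


class BloomFilter:
    """Standard Bloom filter: each key sets k positions in an m-bit array."""

    def __init__(self, expected_items: int, target_fpr: float) -> None:
        # Textbook sizing: m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.
        n = max(1, expected_items)
        self.m = max(1, math.ceil(-n * math.log(target_fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, key: str):
        # Double hashing (an illustrative choice): k positions from one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return ((h1 + i * h2) % self.m for i in range(self.k))

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class LearnedBloomFilter:
    """Single-threshold learned Bloom filter with a backup filter.

    `score` is a hypothetical stand-in for a trained model that maps a key
    to a value in [0, 1]; higher means "more likely in the set".
    """

    def __init__(self, keys, score, threshold: float, backup_fpr: float) -> None:
        self.score = score
        self.threshold = threshold
        # Keys the model scores below the threshold would become false
        # negatives, so they are stored in a conventional backup filter.
        missed = [key for key in keys if score(key) < threshold]
        self.backup = BloomFilter(len(missed), backup_fpr)
        for key in missed:
            self.backup.add(key)

    def __contains__(self, key: str) -> bool:
        return self.score(key) >= self.threshold or key in self.backup


if __name__ == "__main__":
    # Toy "model" (an assumption): keys that start with "user" score high.
    def toy_score(key: str) -> float:
        return 0.9 if key.startswith("user") else 0.1

    keys = [f"user{i}" for i in range(500)] + [f"odd{i}" for i in range(50)]
    lbf = LearnedBloomFilter(keys, toy_score, threshold=0.5, backup_fpr=0.01)
    print(all(key in lbf for key in keys))  # every inserted key is found
```

In this sketch the backup filter only has to hold the keys the model scores below the threshold, which is where the space saving of a learned Bloom filter comes from. The model is consulted only through that one threshold comparison, the kind of limited model use the abstract says leaves room for improvement.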


research
05/22/2021

Support Optimality and Adaptive Cuckoo Filters

Filters (such as Bloom Filters) are data structures that speed up networ...
research
10/21/2019

Adaptive Learned Bloom Filter (Ada-BF): Efficient Utilization of the Classifier

Recent work suggests improving the performance of Bloom filter by incorp...
research
06/27/2020

Optimizing Cuckoo Filter for high burst tolerance,low latency, and high throughput

In this paper, we present an implementation of a cuckoo filter for membe...
research
06/30/2022

Proteus: A Self-Designing Range Filter

We introduce Proteus, a novel self-designing approximate range filter, w...
research
08/05/2022

Compressing (Multidimensional) Learned Bloom Filters

Bloom filters are widely used data structures that compactly represent s...
research
05/11/2022

Raw Filtering of JSON Data on FPGAs

Many Big Data applications include the processing of data streams on sem...
research
09/24/2020

A Case for Partitioned Bloom Filters

In a partitioned Bloom Filter the m bit vector is split into k disjoint ...
