RAMBO: Repeated And Merged Bloom Filter for Multiple Set Membership Testing (MSMT) in Sub-linear time

10/07/2019
by Gaurav Gupta, et al.

Approximate set membership is a common problem with wide applications in databases, networking, and search. Given a set S and a query q, the task is to determine whether q ∈ S. The Bloom Filter (BF) is a popular data structure for approximate membership testing due to its simplicity. In particular, a BF consists of a bit array that can be incrementally updated. A related problem, and the focus of this paper, is Multiple Set Membership Testing (MSMT). Here we are given K different sets, and for any given query q the goal is to find all of the sets containing the query element. Trivially, an MSMT instance can be reduced to K membership testing instances, each with the same q, leading to O(K) query time. A simple array of Bloom Filters achieves that. In this paper, we show the first non-trivial data structure for streaming keys, RAMBO (Repeated And Merged Bloom Filter), which achieves expected O(sqrt(K) log K) query time at the cost of an additional worst-case memory factor of O(log K) compared with the array of Bloom Filters. The proposed data structure is simply a count-min sketch arrangement of Bloom Filters and retains all its favorable properties. We replace the addition operation with a set union and the minimum operation with a set intersection during estimation.
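The count-min style arrangement described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `BloomFilter` and `Rambo`, the helper `_hash`, the blake2b-based hashing, and all parameter defaults are assumptions made for illustration. Each of `num_reps` repetitions partitions the K sets into `num_buckets` buckets, each bucket holds one Bloom Filter that acts as the union of its sets, and a query intersects the candidate sets across repetitions. Choosing about sqrt(K) buckets and about log K repetitions is suggested by the stated query bound, but the abstract does not spell out these settings.

```python
import hashlib


def _hash(value: str, seed: int, modulus: int) -> int:
    """Deterministic seeded hash of a string into range(modulus) (illustrative helper)."""
    digest = hashlib.blake2b(f"{seed}:{value}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus


class BloomFilter:
    """Plain Bloom filter; stores one byte per bit for readability, not compactness."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def add(self, key: str) -> None:
        for i in range(self.num_hashes):
            self.bits[_hash(key, i, self.num_bits)] = 1

    def __contains__(self, key: str) -> bool:
        return all(self.bits[_hash(key, i, self.num_bits)]
                   for i in range(self.num_hashes))


class Rambo:
    """Hypothetical sketch: num_reps x num_buckets grid of merged Bloom filters."""

    def __init__(self, num_sets: int, num_buckets: int, num_reps: int,
                 bits_per_filter: int = 4096, num_hashes: int = 3):
        self.num_buckets = num_buckets
        self.num_reps = num_reps
        self.filters = [[BloomFilter(bits_per_filter, num_hashes)
                         for _ in range(num_buckets)] for _ in range(num_reps)]
        # members[r][b] holds the ids of the sets merged into bucket b of repetition r.
        self.members = [[set() for _ in range(num_buckets)] for _ in range(num_reps)]
        for r in range(num_reps):
            for set_id in range(num_sets):
                self.members[r][self._bucket(set_id, r)].add(set_id)

    def _bucket(self, set_id: int, rep: int) -> int:
        # An independent partition of the sets for every repetition.
        return _hash(f"set-{set_id}", 1000 + rep, self.num_buckets)

    def insert(self, set_id: int, key: str) -> None:
        # Union step: the key is added to the merged filter of the set's bucket.
        for r in range(self.num_reps):
            self.filters[r][self._bucket(set_id, r)].add(key)

    def query(self, key: str) -> set:
        # Intersection step: keep only sets that are candidates in every repetition.
        candidates = None
        for r in range(self.num_reps):
            hit = set()
            for b in range(self.num_buckets):
                if key in self.filters[r][b]:
                    hit |= self.members[r][b]
            candidates = hit if candidates is None else candidates & hit
            if not candidates:
                break
        return candidates or set()


rambo = Rambo(num_sets=16, num_buckets=4, num_reps=3)
rambo.insert(5, "ACGT")
rambo.insert(9, "ACGT")
print(sorted(rambo.query("ACGT")))  # always includes 5 and 9, possibly with false positives
```

As with a single Bloom Filter, the result can contain false positives but never misses a set into which the key was inserted, because every repetition reports the bucket that the set was merged into.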


