Parallel Streaming Random Sampling

06/10/2019
by Kanat Tangwongsan, et al.

This paper investigates parallel random sampling from a potentially unending data stream whose elements arrive in a series of element sequences (minibatches). While stream sampling has been studied extensively in the sequential setting, little has been explored in the parallel context; prior parallel random-sampling algorithms focus on the static batch model. We present parallel algorithms for minibatch-stream sampling in two settings: (1) sliding window, which draws samples from a prespecified number of the most recently observed elements, and (2) infinite window, which draws samples from all elements received so far. Our algorithms are computationally and memory efficient: their work matches that of the fastest sequential counterpart, their parallel depth is small (polylogarithmic), and their memory usage matches the best known.
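To make the two settings concrete, here is a minimal sequential sketch in Python. It is not the paper's parallel algorithm: the infinite-window case uses classic reservoir sampling (Algorithm R), and the sliding-window case is a naive baseline that stores the whole window, so it does not achieve the memory bounds the abstract mentions. The class names, the sample size `s`, and the window length `w` are hypothetical, chosen only for illustration.

```python
import random
from collections import deque


class ReservoirSampler:
    """Infinite window: uniform sample of up to s elements, without
    replacement, from everything seen so far (sequential Algorithm R)."""

    def __init__(self, s, seed=None):
        self.s = s                      # target sample size (assumed name)
        self.n = 0                      # number of elements seen so far
        self.sample = []
        self.rng = random.Random(seed)

    def ingest(self, minibatch):
        # Elements are processed one at a time; a parallel algorithm
        # would handle the whole minibatch at once.
        for x in minibatch:
            self.n += 1
            if len(self.sample) < self.s:
                self.sample.append(x)
            else:
                j = self.rng.randrange(self.n)  # uniform in [0, n)
                if j < self.s:                  # happens with probability s/n
                    self.sample[j] = x


class NaiveWindowSampler:
    """Sliding window: stores the last w elements explicitly and samples
    from them on demand. Simple, but uses O(w) memory."""

    def __init__(self, s, w, seed=None):
        self.s = s
        self.window = deque(maxlen=w)   # old elements fall off automatically
        self.rng = random.Random(seed)

    def ingest(self, minibatch):
        self.window.extend(minibatch)

    def draw(self):
        k = min(self.s, len(self.window))
        return self.rng.sample(list(self.window), k)
```

A caller would feed each minibatch to `ingest` as it arrives, then read `ReservoirSampler.sample` or call `NaiveWindowSampler.draw()` whenever a sample is needed.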


research
05/09/2019

Coresets for Minimum Enclosing Balls over Sliding Windows

Coresets are important tools to generate concise summaries of massive da...
research
10/29/2018

Distinct Sampling on Streaming Data with Near-Duplicates

In this paper we study how to perform distinct sampling in the streaming...
research
07/20/2023

Out-of-Order Sliding-Window Aggregation with Efficient Bulk Evictions and Insertions (Extended Version)

Sliding-window aggregation is a foundational stream processing primitive...
research
11/16/2021

Sequential Unequal Probability Sampling For Stream Population

A new unequal probability sampling method is proposed. This method is se...
research
11/17/2021

Stream Sampling with Immediate Decision

The manuscript introduces a method to select a random sample from a stre...
research
03/02/2022

Pattern Recognition and Event Detection on IoT Data-streams

Big data streams are possibly one of the most essential underlying notio...
research
06/08/2023

Analysis of Knuth's Sampling Algorithm D and D'

In this research paper, we address the Distinct Elements estimation prob...
