Coresets for Minimum Enclosing Balls over Sliding Windows

05/09/2019
by   Yanhao Wang, et al.

Coresets are important tools for generating concise summaries of massive datasets for approximate analysis. A coreset is a small subset of points extracted from the original point set such that certain geometric properties are preserved with provable guarantees. This paper investigates the problem of maintaining a coreset that preserves the minimum enclosing ball (MEB) of a sliding window of points that are continuously updated in a data stream. Although the problem has been extensively studied in the batch and append-only streaming settings, no efficient sliding-window solution is available yet. In this work, we first introduce an algorithm, called AOMEB, to build a coreset for MEB in an append-only stream. AOMEB improves on the practical performance of the state-of-the-art algorithm while having the same approximation ratio. Furthermore, using AOMEB as a building block, we propose two novel algorithms, namely SWMEB and SWMEB+, to maintain coresets for MEB over the sliding window with constant approximation ratios. The proposed algorithms also support coresets for MEB in a reproducing kernel Hilbert space (RKHS). Finally, extensive experiments on real-world and synthetic datasets demonstrate that SWMEB and SWMEB+ achieve speedups of up to four orders of magnitude over the state-of-the-art batch algorithm while providing coresets whose MEBs have small errors relative to the optimal ones.
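
To make the notion of an approximate MEB built from a few selected points concrete, here is a minimal sketch in Python. It is not AOMEB, SWMEB, or SWMEB+, whose details are not given in this abstract. It is the classic Bădoiu-Clarkson farthest-point iteration on a static (batch) point set, swapped in purely for illustration. The function names `dist` and `meb_coreset` and the parameter `eps` are hypothetical. After about 1/eps^2 rounds, the ball around the returned center that encloses all points has radius at most (1 + eps) times the optimal radius. The indices of the farthest points visited are returned as a candidate summary, with no claim that they match the paper's coreset construction.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def meb_coreset(points, eps):
    """Illustrative Badoiu-Clarkson iteration (an assumption, not the paper's method).

    Returns (center, radius, indices) where `radius` is the distance from
    `center` to the farthest input point and `indices` lists the points
    the iteration visited.
    """
    center = list(points[0])          # start at an arbitrary input point
    visited = {0}
    rounds = math.ceil(1.0 / eps ** 2)
    for i in range(1, rounds + 1):
        # Find the point currently farthest from the center.
        far = max(range(len(points)), key=lambda j: dist(center, points[j]))
        visited.add(far)
        # Move the center a shrinking fraction of the way toward that point.
        step = 1.0 / (i + 1)
        center = [c + step * (p - c) for c, p in zip(center, points[far])]
    radius = max(dist(center, p) for p in points)
    return center, radius, sorted(visited)
```

On the four corners of a square plus a few interior points, for example, the visited set contains only corner points and the starting point, since interior points are never farthest from the center. A sliding-window algorithm additionally has to handle points that expire, which this batch sketch does not attempt.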

research
10/29/2021

Improved Sliding Window Algorithms for Clustering and Coverage via Bucketing-Based Sketches

Streaming computation plays an important role in large-scale data analys...
research
06/10/2019

Parallel Streaming Random Sampling

This paper investigates parallel random sampling from a potentially-unen...
research
10/16/2020

Sliding-Window QPS (SW-QPS): A Perfect Parallel Iterative Switching Algorithm for Input-Queued Switches

In this work, we first propose a parallel batch switching algorithm call...
research
11/07/2017

SWOOP: Top-k Similarity Joins over Set Streams

We provide efficient support for applications that aim to continuously f...
research
06/10/2020

Sliding Window Algorithms for k-Clustering Problems

The sliding window model of computation captures scenarios in which data...
research
09/26/2017

SURGE: Continuous Detection of Bursty Regions Over a Stream of Spatial Objects

With the proliferation of mobile devices and location-based services, co...
research
01/07/2022

k-Center Clustering with Outliers in Sliding Windows

Metric k-center clustering is a fundamental unsupervised learning primit...
