Random projections and sampling algorithms for clustering of high-dimensional polygonal curves

07/16/2019
by Stefan Meintrup, et al.

We study the center and median clustering problems for high-dimensional polygonal curves with finite but unbounded complexity. We tackle the computational issue that arises from the high number of dimensions by defining a Johnson-Lindenstrauss projection for polygonal curves. We analyze the resulting error in terms of the Fréchet distance, which is a natural dissimilarity measure for curves. Our algorithms for median clustering achieve sublinear dependency on the number of input curves via subsampling. For center clustering we use the algorithm of Buchin et al. (2019a), which achieves running time linear in the number of input curves. We evaluate our results empirically using a fast, CUDA-parallelized variant of the Alt and Godau algorithm for the Fréchet distance. Our experiments show that our clustering algorithms have fast and accurate practical implementations that yield meaningful results on real-world data from various physical domains.
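To make the idea of projecting curves concrete, here is a minimal sketch, not the authors' implementation. It assumes a polygonal curve is given as a list of vertices, applies one shared Gaussian random matrix to every vertex (a common Johnson-Lindenstrauss construction), and compares curves with the discrete Fréchet distance, which is a simpler stand-in for the continuous Fréchet distance computed by the Alt and Godau algorithm. The function names `random_projection_matrix`, `project_curve`, and `discrete_frechet` are hypothetical and chosen only for illustration.

```python
import math
import random


def random_projection_matrix(d, target_dim, seed=0):
    """Return a target_dim x d matrix with i.i.d. N(0, 1/target_dim) entries.

    Assumption: a plain Gaussian JL matrix; the paper may use a different
    construction or scaling.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(target_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)]
            for _ in range(target_dim)]


def project_curve(curve, matrix):
    """Apply the same linear map to every vertex of a polygonal curve."""
    return [tuple(sum(r * x for r, x in zip(row, vertex)) for row in matrix)
            for vertex in curve]


def discrete_frechet(p, q):
    """Discrete Fréchet distance between vertex sequences p and q.

    Standard O(len(p) * len(q)) dynamic program; it only considers
    vertex-to-vertex couplings, so it upper-bounds the continuous distance.
    """
    n, m = len(p), len(q)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = cost
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], cost)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], cost)
            else:
                prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(prev, cost)
    return dp[n - 1][m - 1]


if __name__ == "__main__":
    rng = random.Random(1)
    d, target_dim = 200, 20
    # Two synthetic curves with 8 vertices each in 200 dimensions.
    a = [tuple(rng.uniform(-1, 1) for _ in range(d)) for _ in range(8)]
    b = [tuple(rng.uniform(-1, 1) for _ in range(d)) for _ in range(8)]
    matrix = random_projection_matrix(d, target_dim)
    print("original :", discrete_frechet(a, b))
    print("projected:", discrete_frechet(project_curve(a, matrix),
                                         project_curve(b, matrix)))
```

Because the projection is linear, projecting only the vertices is equivalent to projecting every point on the curve's edges, which is why a single matrix applied vertex-wise suffices in this sketch.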


research
04/19/2021

Coresets for (k, ℓ)-Median Clustering under the Fréchet Distance

We present an algorithm for computing ϵ-coresets for (k, ℓ)-median clust...
research
09/03/2020

Approximating (k,ℓ)-Median Clustering for Polygonal Curves

In 2015, Driemel, Krivošija and Sohler introduced the (k,ℓ)-median probl...
research
12/07/2012

Similarity of Polygonal Curves in the Presence of Outliers

The Fréchet distance is a well studied and commonly used measure to capt...
research
02/21/2019

On the hardness of computing an average curve

We study the complexity of clustering curves under k-median and k-center...
research
08/28/2023

Finding Complex Patterns in Trajectory Data via Geometric Set Cover

Clustering trajectories is a central challenge when confronted with larg...
research
07/15/2022

Random projections for curves in high dimensions

Modern time series analysis requires the ability to handle datasets that...
research
11/06/2018

High Dimensional Clustering with r-nets

Clustering, a fundamental task in data science and machine learning, gro...
