Subtrajectory Clustering: Finding Set Covers for Set Systems of Subcurves

by Hugo A. Akitaya, et al.

We study subtrajectory clustering under the Fréchet distance. Given one or more trajectories, the task is to split the trajectories into several parts such that the parts have a good clustering structure. We approach this problem via a new set cover formulation, which we think provides a natural formalization of the problem as it is studied in many applications. Given a polygonal curve P with n vertices in fixed dimension, integers k, ℓ ≥ 1, and a real value Δ > 0, the goal is to find k center curves of complexity at most ℓ such that every point on P is covered by a subtrajectory whose Fréchet distance to one of the k center curves is at most Δ. In many application scenarios, one is interested in finding clusters of small complexity, which is controlled by the parameter ℓ. Our main result is a tri-criteria approximation algorithm: if there exists a solution for the given parameters k, ℓ, and Δ, then our algorithm finds a set of k' center curves of complexity at most ℓ' with covering radius Δ', where k' ∈ O(k ℓ^2 log(k ℓ)), ℓ' ≤ 2ℓ, and Δ' ≤ 19Δ. Moreover, within these approximation bounds, we can minimize k while keeping the other parameters fixed. If ℓ is a constant independent of n, then the approximation factor for the number of clusters k is O(log k) and the approximation factor for the radius Δ is constant. In this case, the algorithm has expected running time in Õ(k m^2 + mn) and uses space in O(n + m), where m = ⌈L/Δ⌉ and L is the total arclength of the curve P. For the important case of clustering with line segments (ℓ = 2), we obtain bi-criteria approximation algorithms, where the approximation criteria are the number of clusters and the radius of the clustering.
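To make the set cover view and the parameter m = ⌈L/Δ⌉ concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it uses the textbook greedy heuristic for set cover, and it assumes the sets of covered curve pieces have already been computed for each candidate center. The function names (`arclength`, `num_pieces`, `greedy_set_cover`) and the toy candidate sets are hypothetical and chosen only for illustration.

```python
import math


def arclength(curve):
    """Total arclength L of a polygonal curve given as a list of points."""
    return sum(math.dist(p, q) for p, q in zip(curve, curve[1:]))


def num_pieces(curve, delta):
    """The quantity m = ceil(L / delta) from the abstract."""
    return math.ceil(arclength(curve) / delta)


def greedy_set_cover(universe, candidates):
    """Textbook greedy set cover (an illustrative stand-in, not the paper's method).

    universe:   iterable of elements to cover (here: indices of curve pieces)
    candidates: dict mapping a candidate name to the set of elements it covers
                (assumed to be precomputed, e.g. via Frechet distance tests)
    Returns the list of chosen candidate names, in order of selection.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered elements.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("candidates do not cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Toy usage: a curve of arclength 11 cut into pieces of length about delta = 2.
curve = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
m = num_pieces(curve, 2.0)

# Hypothetical candidate centers and the pieces each one covers.
candidates = {"a": {0, 1, 2, 3}, "b": {2, 3, 4}, "c": {4, 5}, "d": {0}}
centers = greedy_set_cover(range(m), candidates)
```

In this toy instance the greedy rule first picks the candidate covering the most pieces and then the one covering the remainder. The generic greedy heuristic gives a logarithmic approximation for set cover in general; the bounds quoted in the abstract come from the paper's own analysis and do not follow from this sketch.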


Clustering with Neighborhoods

In the standard planar k-center clustering problem, one is given a set P...

Faster Approximate Covering of Subcurves under the Fréchet Distance

Subtrajectory clustering is an important variant of the trajectory clust...

A Composable Coreset for k-Center in Doubling Metrics

A set of points P in a metric space and a constant integer k are given. ...

Finding Complex Patterns in Trajectory Data via Geometric Set Cover

Clustering trajectories is a central challenge when confronted with larg...

Explainable Clustering via Exemplars: Complexity and Efficient Approximation Algorithms

Explainable AI (XAI) is an important developing area but remains relativ...

Counting points on hyperelliptic curves with explicit real multiplication in arbitrary genus

We present a probabilistic Las Vegas algorithm for computing the local z...

Improved Complexity Bounds for Counting Points on Hyperelliptic Curves

We present a probabilistic Las Vegas algorithm for computing the local z...
