Greedy Strategy Works for Clustering with Outliers and Coresets Construction

01/24/2019
by Hu Ding, et al.

We study the problems of clustering with outliers in high dimensions. Although a number of methods have been developed over the past decades, it remains quite challenging to design algorithms for these problems that have both quality guarantees and low complexity. Our idea is inspired by Gonzalez's algorithm, the greedy method for solving the ordinary k-center clustering problem. Based on some novel observations, we show that this greedy strategy can in fact handle k-center/median/means clustering with outliers efficiently, in terms of both solution quality and complexity. We further show that the greedy approach yields a small coreset for the problem in doubling metrics, which reduces the time complexity significantly. Moreover, as a by-product, the coreset construction can be applied to speed up the popular density-based clustering approach DBSCAN.
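
For readers unfamiliar with the greedy method the abstract refers to, the following is a minimal sketch of the classical Gonzalez farthest-point procedure for ordinary k-center clustering, without outliers. It is not the outlier-aware variant or the coreset construction developed in the paper. The function name `gonzalez_k_center`, the `first` parameter, and the use of Euclidean distance are assumptions made for illustration only.

```python
import math


def gonzalez_k_center(points, k, first=0):
    """Classical farthest-point greedy for ordinary k-center (no outliers).

    Illustrative sketch: `points` is a list of coordinate tuples and
    `first` is the index of the arbitrary initial center.
    Returns (list of center indices, resulting clustering radius).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [first]
    # dist[i] holds the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[first]) for p in points]

    while len(centers) < k:
        # Greedy step: take the point farthest from the current centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        # Refresh nearest-center distances against the new center only.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < dist[i]:
                dist[i] = d

    return centers, max(dist)


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 20)]
    print(gonzalez_k_center(data, k=3))
```

In this toy input the isolated point `(5, 20)` is selected as the second center, because the plain greedy rule always jumps to the farthest point. This sensitivity to far-away points is what makes the setting with outliers harder than ordinary k-center.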

research
01/24/2019

Greedy Strategy Works for k-Center Clustering with Outliers and Coreset Construction

We study the problem of k-center clustering with outliers in arbitrary m...
research
01/07/2023

Randomized Greedy Algorithms and Composable Coreset for k-Center Clustering with Outliers

In this paper, we study the problem of k-center clustering with outliers...
research
02/27/2020

The Effectiveness of Johnson-Lindenstrauss Transform for High Dimensional Optimization with Outliers

Johnson-Lindenstrauss (JL) Transform is one of the most popular methods ...
research
02/27/2020

On Metric DBSCAN with Low Doubling Dimension

The density based clustering method Density-Based Spatial Clustering of ...
research
04/25/2018

Bi-criteria Approximation Algorithms for Minimum Enclosing Ball and k-Center Clustering with Outliers

Motivated by the arising realistic issues in big data, the problem of Mi...
research
05/24/2019

A Practical Framework for Solving Center-Based Clustering with Outliers

Clustering has many important applications in computer science, but real...
research
03/05/2020

Fast Noise Removal for k-Means Clustering

This paper considers k-means clustering in the presence of noise. It is ...
