Greedy Strategy Works for Clustering with Outliers and Coresets Construction

01/24/2019
by Hu Ding, et al.

We study the problems of clustering with outliers in high dimensions. Although a number of methods have been developed over the past decades, it remains quite challenging to design algorithms for these problems that have both quality guarantees and low complexity. Our idea is inspired by Gonzalez's algorithm, the greedy method for solving the ordinary k-center clustering problem. Based on some novel observations, we show that this greedy strategy can in fact handle k-center/median/means clustering with outliers efficiently, in terms of both solution quality and complexity. We further show that the greedy approach yields small coresets for the problem in doubling metrics, which reduces the time complexity significantly. Moreover, as a by-product, the coreset construction can be applied to speed up the popular density-based clustering approach DBSCAN.
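
For readers unfamiliar with the greedy method mentioned above, the following is a minimal sketch of Gonzalez's farthest-first traversal for ordinary k-center clustering, which is known to give a 2-approximation. It is an illustration only and does not reproduce the outlier-aware variant or the coreset construction studied in the paper. The function names `gonzalez_k_center` and `clustering_radius`, the `first` parameter, and the use of Euclidean distance are assumptions made for this example.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def gonzalez_k_center(points: List[Point], k: int, first: int = 0) -> List[int]:
    """Return indices of k centers chosen by farthest-first traversal.

    Illustrative sketch of the ordinary (no-outlier) greedy method;
    Euclidean distance is assumed here.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [first]
    # dist[i] holds the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # Greedy step: the next center is the point farthest from all centers so far.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        # Refresh nearest-center distances against the new center only.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < dist[i]:
                dist[i] = d
    return centers


def clustering_radius(points: List[Point], centers: List[int]) -> float:
    """Largest distance from any point to its nearest center (the k-center cost)."""
    return max(min(math.dist(p, points[c]) for c in centers) for p in points)


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 20)]
    chosen = gonzalez_k_center(pts, k=3)
    print(chosen, clustering_radius(pts, chosen))
```

Each round costs one pass over the data, so the sketch runs in O(nk) distance computations. Because the greedy step always picks the farthest point, an isolated point such as `(5, 20)` in the usage example is selected as a center early, which hints at why the plain method needs care when outliers are present.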
