Achieving anonymity via weak lower bound constraints for k-median and k-means

09/07/2020
by Anna Arutyunova, et al.

We study k-clustering problems with lower bounds, including k-median and k-means clustering with lower bounds. In addition to the point set P and the number of centers k, a k-clustering problem with (uniform) lower bounds takes a number B as input. The solution space is restricted to clusterings in which every cluster has at least B points. We demonstrate how to approximate k-median with lower bounds via a reduction to facility location with lower bounds, for which O(1)-approximation algorithms are known. Then we propose a new constrained clustering problem with lower bounds in which points may be assigned multiple times (to different centers). That is, for every point, the clustering specifies a set of centers to which it is assigned. We call this clustering with weak lower bounds. We give an 8-approximation for k-median clustering with weak lower bounds and an O(1)-approximation for k-means with weak lower bounds. We conclude by showing that, at a constant-factor increase in the approximation factor, we can restrict the number of assignments of every point to 2 (or, if we allow fractional assignments, to 1+ϵ). This also leads to the first bicriteria approximation algorithm for k-means with (standard) lower bounds, where bicriteria means that the lower bounds are violated by a constant factor. All algorithms in this paper run in time polynomial in n and k (and in d for the Euclidean variants considered).
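To make the weak lower bound constraint concrete, here is a minimal Python sketch that evaluates a given solution. It is an illustration, not the paper's algorithm. The function name `weak_lower_bound_cost`, its parameters, and the toy data are hypothetical. The sketch also assumes that every point-to-center assignment is charged its own distance, which is one natural reading of the abstract; the abstract itself does not spell out the cost function.

```python
from math import dist

def weak_lower_bound_cost(points, centers, assignment, B, power=1):
    """Evaluate a clustering with weak lower bounds (illustrative sketch).

    points     -- list of coordinate tuples
    centers    -- list of coordinate tuples (the k open centers)
    assignment -- one set of center indices per point; a point may be
                  assigned to several centers
    B          -- uniform lower bound on the number of points per center
    power      -- 1 for k-median style cost, 2 for k-means style cost

    Returns (cost, feasible). Assumption: each assignment pair is charged
    separately, so a point assigned to two centers pays for both.
    """
    if len(assignment) != len(points):
        raise ValueError("need exactly one assignment set per point")
    loads = [0] * len(centers)
    cost = 0.0
    for p, assigned in zip(points, assignment):
        if not assigned:
            raise ValueError("every point must be assigned to a center")
        for c in assigned:
            loads[c] += 1
            cost += dist(p, centers[c]) ** power
    # Weak lower bound: every center receives at least B assignments.
    feasible = all(load >= B for load in loads)
    return cost, feasible

# Toy instance on a line: three points near 1 and one far point at 10.
points = [(0,), (1,), (2,), (10,)]
centers = [(1,), (10,)]

# A plain partition leaves the center at 10 with only one point.
partition = [{0}, {0}, {0}, {1}]
# Assigning the point at 2 to both centers satisfies B = 2.
weak = [{0}, {0}, {0, 1}, {1}]

print(weak_lower_bound_cost(points, centers, partition, B=2))
print(weak_lower_bound_cost(points, centers, weak, B=2))
```

In this toy instance the partition is infeasible for B = 2, while the multi-assignment is feasible at the price of one extra connection. Setting `power=2` gives the analogous k-means style evaluation.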

research
02/25/2022

Towards Optimal Lower Bounds for k-median and k-means Coresets

Given a set of points in a metric space, the (k,z)-clustering problem co...
research
12/12/2012

Optimal Time Bounds for Approximate Clustering

Clustering is a fundamental problem in unsupervised learning, and has be...
research
08/09/2019

Unexpected Effects of Online K-means Clustering

In this paper we study k-means clustering in the online setting. In the ...
research
08/19/2013

A balanced k-means algorithm for weighted point sets

The classical k-means algorithm for partitioning n points in R^d into k ...
research
07/01/2021

On Variants of Facility Location Problem with Outliers

In this work, we study the extension of two variants of the facility loc...
research
07/11/2019

Analysis of Ward's Method

We study Ward's method for the hierarchical k-means problem. This popula...
research
09/19/2018

Improved Bounds for the Traveling Salesman Problem with Neighborhoods on Uniform Disks

Given a set of n disks of radius R in the Euclidean plane, the Traveling...
