On the cost of essentially fair clusterings

11/26/2018
by   Ioana O. Bercea, et al.

Clustering is a fundamental tool in data mining. It partitions points into groups (clusters) and may be used to make decisions for each point based on its group. However, this process may harm protected (minority) classes if the clustering algorithm does not adequately represent them in desirable clusters, especially if the data is already biased. At NIPS 2017, Chierichetti et al. proposed a model for fair clustering that requires the representation in each cluster to (approximately) preserve the global fraction of each protected class. Restricting to two protected classes, they developed both a 4-approximation for the fair k-center problem and an O(t)-approximation for the fair k-median problem, where t is a parameter of the fairness model. For multiple protected classes, the best known result is a 14-approximation for fair k-center. We extend and improve the known results. Firstly, we give a 5-approximation for the fair k-center problem with multiple protected classes. Secondly, we propose a relaxed fairness notion under which we can give bicriteria constant-factor approximations for all of the classical clustering objectives: k-center, k-supplier, k-median, k-means and facility location. The latter approximations are achieved by a framework that takes an arbitrary existing unfair (integral) solution and a fair (fractional) LP solution and combines them into an essentially fair clustering with a weakly supervised rounding scheme. In this way, a fair clustering can be established belatedly, in a situation where the centers are already fixed.
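
To make the fairness requirement concrete, the following minimal Python sketch measures how far a given clustering is from preserving the global fraction of each protected class in every cluster. It is an illustration only, not the algorithm or the exact fairness definition from the paper: the function name `max_fraction_deviation`, the input encoding (one cluster label and one class label per point) and the use of absolute deviation as the measure are all assumptions made here.

```python
from collections import Counter


def max_fraction_deviation(labels, colors):
    """Largest gap between a class's share in a cluster and its global share.

    labels[i] is the cluster of point i, colors[i] its protected class.
    A value of 0.0 means every cluster reproduces the global class
    fractions exactly. (Hypothetical helper, not taken from the paper.)
    """
    if len(labels) != len(colors) or not labels:
        raise ValueError("labels and colors must be non-empty and equally long")

    n = len(labels)
    global_counts = Counter(colors)             # points per protected class
    cluster_sizes = Counter(labels)             # points per cluster
    pair_counts = Counter(zip(labels, colors))  # points per (cluster, class)

    worst = 0.0
    for cluster, size in cluster_sizes.items():
        for color, count in global_counts.items():
            in_cluster = pair_counts[(cluster, color)] / size
            worst = max(worst, abs(in_cluster - count / n))
    return worst


# Each cluster holds one point of each class: global fractions are preserved.
balanced = max_fraction_deviation([0, 0, 1, 1], ["r", "b", "r", "b"])

# Each cluster holds a single class: maximally unbalanced for two classes.
segregated = max_fraction_deviation([0, 0, 1, 1], ["r", "r", "b", "b"])
```

A check of this kind could be applied to any existing clustering, including one whose centers are already fixed, to see how much rebalancing would be needed.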

research
06/26/2021

Improved Approximation Algorithms for Individually Fair Clustering

We consider the k-clustering problem with ℓ_p-norm cost, which includes ...
research
02/15/2018

Fair Clustering Through Fairlets

We study the question of fair clustering under the disparate impact doc...
research
02/18/2020

Fair Clustering with Multiple Colors

A fair clustering instance is given a data set A in which every point is...
research
05/31/2019

Principal Fairness: Removing Bias via Projections

Reducing hidden bias in the data and ensuring fairness in algorithmic da...
research
07/14/2020

A Pairwise Fair and Community-preserving Approach to k-Center Clustering

Clustering is a foundational problem in machine learning with numerous a...
research
09/02/2023

Approximating Fair k-Min-Sum-Radii in ℝ^d

The k-center problem is a classical clustering problem in which one is a...
research
05/09/2019

Proportionally Fair Clustering

We extend the fair machine learning literature by considering the proble...
