Scalable Fair Clustering

02/10/2019
by Arturs Backurs, et al.

We study the fair variant of the classic k-median problem introduced by Chierichetti et al. [2017]. In the standard k-median problem, given an input pointset P, the goal is to find k centers C and assign each input point to one of the centers in C such that the average distance of points to their cluster center is minimized. In the fair variant of k-median, the points are colored, and the goal is to minimize the same average distance objective while ensuring that all clusters have an "approximately equal" number of points of each color. Chierichetti et al. proposed a two-phase algorithm for fair k-clustering. In the first phase, the pointset is partitioned into subsets called fairlets that satisfy the fairness requirement and approximately preserve the k-median objective. In the second phase, fairlets are merged into k clusters by one of the existing k-median algorithms. The running time of this algorithm is dominated by the first phase, which takes super-quadratic time. In this paper, we present a practical approximate fairlet decomposition algorithm that runs in nearly linear time. Our algorithm additionally allows for finer control over the balance of resulting clusters than the original work. We complement our theoretical bounds with an empirical evaluation.
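To make the two quantities in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It evaluates a given clustering by its average point-to-center distance (the k-median objective as described above) and by a balance score. The balance score is an assumption for illustration: it takes exactly two colors, labeled "red" and "blue" here, and scores each cluster as min(#red/#blue, #blue/#red), then reports the worst cluster. The function names `kmedian_cost` and `clustering_balance` are hypothetical.

```python
from collections import Counter
from math import dist


def kmedian_cost(points, centers, assignment):
    """Average Euclidean distance from each point to its assigned center."""
    total = sum(dist(p, centers[a]) for p, a in zip(points, assignment))
    return total / len(points)


def clustering_balance(colors, assignment, k):
    """Worst per-cluster balance, assuming two colors 'red' and 'blue'.

    A cluster's balance is min(#red/#blue, #blue/#red); a non-empty
    cluster missing one color has balance 0. Empty clusters are skipped.
    """
    counts = [Counter() for _ in range(k)]
    for color, a in zip(colors, assignment):
        counts[a][color] += 1

    worst = 1.0
    for cnt in counts:
        if not cnt:
            continue  # empty cluster: nothing to check
        red, blue = cnt["red"], cnt["blue"]
        if red == 0 or blue == 0:
            return 0.0
        worst = min(worst, red / blue, blue / red)
    return worst


# Toy input: two well-separated pairs, each with one red and one blue point.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
colors = ["red", "blue", "red", "blue"]
centers = [(0, 1), (10, 1)]

fair_assignment = [0, 0, 1, 1]    # each cluster gets one red, one blue
skewed_assignment = [0, 0, 0, 1]  # second cluster contains only a blue point

print(kmedian_cost(points, centers, fair_assignment))
print(clustering_balance(colors, fair_assignment, k=2))
print(clustering_balance(colors, skewed_assignment, k=2))
```

A balance of 1 means every cluster has equal numbers of both colors, while 0 means some cluster is monochromatic; a fair clustering algorithm trades off this score against the distance objective.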


research
02/17/2020

(Individual) Fairness for k-Clustering

We give a local search based algorithm for k-median (k-means) clustering...
research
06/20/2019

Coresets for Clustering with Fairness Constraints

In a recent work, Chierichetti et al. studied the following "fair" varia...
research
10/26/2020

KFC: A Scalable Approximation Algorithm for k-center Fair Clustering

In this paper, we study the problem of fair clustering on the k-center o...
research
06/23/2021

Better Algorithms for Individually Fair k-Clustering

We study data clustering problems with ℓ_p-norm objectives (e.g. k-Media...
research
12/28/2018

Fair Coresets and Streaming Algorithms for Fair k-Means Clustering

We study fair clustering problems as proposed by Chierichetti et al. Her...
research
05/19/2011

Hierarchical Recursive Running Median

To date, the histogram-based running median filter of Perreault and Hébe...
research
05/09/2019

Proportionally Fair Clustering

We extend the fair machine learning literature by considering the proble...
