Socially Fair k-Means Clustering

06/17/2020
by   Mehrdad Ghadiri, et al.

We show that the popular k-means clustering algorithm (Lloyd's heuristic), used for a variety of scientific data, can result in outcomes that are unfavorable to subgroups of data (e.g., demographic groups). Such biased clusterings can have deleterious implications for human-centric applications such as resource allocation. We present a fair k-means objective and algorithm to choose cluster centers that provide equitable costs for different groups. The algorithm, Fair-Lloyd, is a modification of Lloyd's heuristic for k-means, inheriting its simplicity, efficiency, and stability. In comparison with standard Lloyd's, we find that on benchmark datasets, Fair-Lloyd exhibits unbiased performance by ensuring that all groups have equal costs in the output k-clustering, while incurring a negligible increase in running time, thus making it a viable fair option wherever k-means is currently used.
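
The abstract describes the fair objective only informally ("equitable costs for different groups"). As a rough illustration, the sketch below assumes one plausible reading: each group's cost is the average squared distance from its points to their nearest center, and a clustering is scored by the worst-off group's cost. That formula and the names `sq_dist`, `group_costs`, and `fair_cost` are assumptions made for this example, not taken from the paper, and the code only evaluates a given set of centers; it is not the Fair-Lloyd algorithm itself.

```python
from collections import defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def group_costs(points, groups, centers):
    """Average squared distance to the nearest center, per group.

    Assumption: this per-group average is the 'cost' that the fair
    objective tries to balance across groups.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p, g in zip(points, groups):
        totals[g] += min(sq_dist(p, c) for c in centers)
        counts[g] += 1
    return {g: totals[g] / counts[g] for g in totals}


def fair_cost(points, groups, centers):
    """Cost of the worst-off group (hypothetical min-max style score)."""
    return max(group_costs(points, groups, centers).values())


# Toy data: two groups, two centers.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (14.0, 0.0)]
groups = ["a", "a", "b", "b"]
centers = [(0.0, 1.0), (10.0, 0.0)]

costs = group_costs(points, groups, centers)
worst = fair_cost(points, groups, centers)
print(costs, worst)
```

On this toy data the two groups see very different average costs under the same centers, which is the kind of disparity the abstract says standard k-means can produce.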

research
08/22/2022

Socially Fair Center-based and Linear Subspace Clustering

Center-based clustering (e.g., k-means, k-medians) and clustering using ...
research
02/06/2023

Fair Minimum Representation Clustering

Clustering is an unsupervised learning task that aims to partition data ...
research
05/29/2021

A Stochastic Alternating Balance k-Means Algorithm for Fair Clustering

In the application of data clustering to human-centric decision-making s...
research
06/22/2022

Constant-Factor Approximation Algorithms for Socially Fair k-Clustering

We study approximation algorithms for the socially fair (ℓ_p, k)-cluster...
research
10/04/2022

Robust Fair Clustering: A Novel Fairness Attack and Defense Framework

Clustering algorithms are widely used in many societal resource allocati...
research
06/23/2014

Further heuristics for k-means: The merge-and-split heuristic and the (k,l)-means

Finding the optimal k-means clustering is NP-hard in general and many he...
research
02/02/2022

A quantitative method for benchmarking fair income distribution

Concern about income inequality has become prominent in public discourse...
