A Notion of Individual Fairness for Clustering

by Matthäus Kleindessner, et al.

A common distinction in fair machine learning, in particular in fair classification, is between group fairness and individual fairness. In the context of clustering, group fairness has been studied extensively in recent years; however, individual fairness for clustering has hardly been explored. In this paper, we propose a natural notion of individual fairness for clustering. Our notion asks that every data point be, on average, closer to the points in its own cluster than to the points in any other cluster. We study several questions related to our proposed notion of individual fairness. On the negative side, we show that deciding whether a given data set admits such an individually fair clustering is NP-hard in general. On the positive side, for the special case of a data set lying on the real line, we propose an efficient dynamic programming approach to find an individually fair clustering. For general data sets, we investigate heuristics aimed at minimizing the number of individual fairness violations and compare them to standard clustering approaches on real data sets.
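The fairness condition stated above can be checked directly for a given clustering. The following is a minimal sketch, not the authors' implementation: the function name `fairness_violations` is hypothetical, Euclidean distance is assumed, and two conventions that the abstract does not specify are chosen for illustration (a point is excluded from the average over its own cluster, and a point in a singleton cluster is counted as fair).

```python
import numpy as np

def fairness_violations(X, labels):
    """Return indices of points whose average distance to their own cluster
    exceeds their average distance to some other cluster.

    Assumptions (not taken from the abstract): Euclidean distance, the point
    itself is excluded from its own-cluster average, singletons count as fair.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D array as points on the real line
    labels = np.asarray(labels)

    # Full pairwise distance matrix via broadcasting: dist[i, j] = ||x_i - x_j||
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    clusters = np.unique(labels)

    violators = []
    for i in range(len(X)):
        own = labels == labels[i]
        n_others_in_own = own.sum() - 1
        if n_others_in_own == 0:
            continue  # singleton cluster: assumed fair by convention
        # dist[i, i] is 0, so the sum already ignores the point itself
        own_avg = dist[i, own].sum() / n_others_in_own
        for c in clusters:
            if c == labels[i]:
                continue
            if own_avg > dist[i, labels == c].mean():
                violators.append(i)
                break
    return violators

# Five points on the real line, clustered two ways
points = [0.0, 1.0, 2.0, 10.0, 11.0]
print(fairness_violations(points, [0, 0, 0, 1, 1]))  # natural split
print(fairness_violations(points, [0, 0, 1, 1, 1]))  # point 2.0 is misplaced
```

A clustering is individually fair in this sense exactly when the returned list is empty, and the length of the list is the number of violations that a heuristic would try to minimize. The sketch builds the full distance matrix, so it uses quadratic memory and suits only small data sets.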



Individual Preference Stability for Clustering

In this paper, we propose a natural notion of individual preference (IP)...

Fair Labeled Clustering

Numerous algorithms have been produced for the fundamental problem of cl...

On the Apparent Conflict Between Individual and Group Fairness

A distinction has been drawn in fair machine learning research between `...

A New Notion of Individually Fair Clustering: α-Equitable k-Center

Clustering is a fundamental problem in unsupervised machine learning, an...

Guarantees for Spectral Clustering with Fairness Constraints

Given the widespread popularity of spectral clustering (SC) for partitio...

Interpretable Assessment of Fairness During Model Evaluation

For companies developing products or algorithms, it is important to unde...

Fair k-Center Clustering for Data Summarization

In data summarization we want to choose k prototypes in order to summari...