A Pairwise Fair and Community-preserving Approach to k-Center Clustering

07/14/2020
by Brian Brubach, et al.

Clustering is a foundational problem in machine learning with numerous applications. As machine learning becomes more ubiquitous as a backend for automated systems, concerns about fairness arise. Much of the current literature on fairness deals with discrimination against protected classes in supervised learning (group fairness). We define a different notion of fair clustering, in which the probability that two points (or a community of points) become separated is bounded by an increasing function of their pairwise distance (or community diameter). This captures settings where data points represent people who gain some benefit from being clustered together. Unfairness arises when certain points are deterministically separated, either arbitrarily or by someone who intends to harm them, as in the gerrymandering of election districts. In response, we formally define two new types of fairness in the clustering setting: pairwise fairness and community preservation. To explore the practicality of these fairness goals, we devise an approach for extending existing k-center algorithms to satisfy the fairness constraints. Our analysis of this approach proves that reasonable approximations can be achieved while maintaining fairness. In experiments, we compare the effectiveness of our approach to classical k-center algorithms and heuristics, and explore the tradeoff between optimal clustering and fairness.
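To make the notion of a separation probability concrete, the following is a minimal sketch and not the authors' algorithm. It uses the classical farthest-first greedy heuristic for k-center with a randomly chosen first center, and estimates empirically how often two given points land in different clusters. The function names (`greedy_k_center`, `separation_frequency`), the Euclidean metric, and the random-start scheme are assumptions made for illustration only.

```python
import math
import random


def greedy_k_center(points, k, rng):
    """Farthest-first traversal with a random first center (illustrative baseline)."""
    centers = [rng.randrange(len(points))]
    # dist[i] holds the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[centers[0]]) for p in points]
    while len(centers) < k:
        # Pick the point currently farthest from all chosen centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    # Assign every point to the position (0..k-1) of its nearest center.
    labels = [
        min(range(len(centers)), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


def separation_frequency(points, k, i, j, trials=200, seed=0):
    """Fraction of randomized runs in which points i and j end up in different clusters."""
    rng = random.Random(seed)
    separated = 0
    for _ in range(trials):
        _, labels = greedy_k_center(points, k, rng)
        separated += labels[i] != labels[j]
    return separated / trials


if __name__ == "__main__":
    # Hypothetical toy data: two tight pairs that are far apart from each other.
    pts = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.1, 0.0)]
    print(separation_frequency(pts, 2, 0, 1))  # a nearby pair
    print(separation_frequency(pts, 2, 0, 2))  # a distant pair
```

A pairwise-fair method in the sense described above would keep this frequency small for nearby pairs and allow it to grow with distance; the sketch only measures the quantity for a baseline heuristic and does not enforce any such bound.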


