Cluster-level Group Representativity Fairness in k-means Clustering

12/29/2022
by Stanley Simoes, et al.

Fair clustering algorithms, which seek to do justice to the representation of groups defined along sensitive attributes such as race and gender, have attracted much recent interest. We observe that clustering algorithms can generate clusters in which different groups are disadvantaged within different clusters. We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms such as k-means, that focuses on mitigating the unfairness experienced by the most-disadvantaged group within each cluster. Our method uses an iterative optimisation paradigm in which an initial cluster assignment is modified by reassigning objects to clusters so that the worst-off sensitive group within each cluster benefits. We demonstrate the effectiveness of our method through extensive empirical evaluations on real-world datasets, using a novel evaluation metric. Specifically, we show that our method significantly enhances cluster-level group representativity fairness at a low cost to cluster coherence.
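To make the notion of a "worst-off group within each cluster" concrete, the following is a minimal sketch, not the authors' method. The abstract does not define how group disadvantage is measured, so the sketch assumes one plausible reading: a group's cost within a cluster is the mean squared distance of its members to the cluster centroid, and the worst-off group is the one with the highest cost. The function name `worst_off_groups` and the toy data are hypothetical.

```python
import numpy as np

def worst_off_groups(X, labels, groups):
    """Return {cluster: (worst_group, cost)} for a given cluster assignment.

    ASSUMPTION (not from the abstract): a group's cost in a cluster is the
    mean squared Euclidean distance of its members to the cluster centroid.
    """
    result = {}
    for c in np.unique(labels):
        in_cluster = labels == c
        centroid = X[in_cluster].mean(axis=0)
        costs = {}
        # Only groups that actually have members in this cluster are scored.
        for g in np.unique(groups[in_cluster]):
            members = X[in_cluster & (groups == g)]
            sq_dists = ((members - centroid) ** 2).sum(axis=1)
            costs[g.item()] = float(sq_dists.mean())
        worst = max(costs, key=costs.get)
        result[int(c)] = (worst, costs[worst])
    return result

# Toy data: two clusters, two sensitive groups "a" and "b" (illustrative only).
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0],
              [10.0, 10.0], [10.0, 13.0], [10.0, 10.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
groups = np.array(["a", "a", "b", "a", "b", "a"])

result = worst_off_groups(X, labels, groups)
```

Under this assumed measure, an iterative scheme of the kind the abstract describes would repeatedly reassign objects so that the per-cluster worst cost falls, while keeping an eye on overall cluster coherence.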


research
10/11/2019

Fairness in Clustering with Multiple Sensitive Attributes

A clustering may be considered as fair on pre-specified sensitive attrib...
research
05/04/2022

Exploring Rawlsian Fairness for K-Means Clustering

We conduct an exploratory study that looks at incorporating John Rawls' ...
research
02/08/2021

Learning to Generate Fair Clusters from Demonstrations

Fair clustering is the process of grouping similar entities together, wh...
research
03/09/2020

Probabilistic Partitive Partitioning (PPP)

Clustering is a NP-hard problem. Thus, no optimal algorithm exists, heur...
research
07/05/2016

Algorithms for Generalized Cluster-wise Linear Regression

Cluster-wise linear regression (CLR), a clustering problem intertwined w...
research
05/06/2020

A Bernoulli Mixture Model to Understand and Predict Children Longitudinal Wheezing Patterns

In this research, we estimate that around 27.99(±2.15)% of the populatio...
research
02/07/2020

A novel initialisation based on hospital-resident assignment for the k-modes algorithm

This paper presents a new way of selecting an initial solution for the k...
