Generalized Reductions: Making any Hierarchical Clustering Fair and Balanced with Low Cost

05/27/2022
by Marina Knittel, et al.

Clustering is a fundamental building block of modern statistical analysis pipelines, and fair clustering has received much attention from the machine learning community in recent years. We are among the first to study fairness in the context of hierarchical clustering, following the results of Ahmadian et al. at NeurIPS 2020. We evaluate our results using Dasgupta's cost function, perhaps one of the most prevalent theoretical metrics for evaluating hierarchical clusterings. Our work vastly improves the previous O(n^{5/6} polylog(n)) fair approximation for cost to a near-polylogarithmic O(n^δ polylog(n)) fair approximation for any constant δ ∈ (0,1). This result establishes a cost-fairness tradeoff and extends to broader fairness constraints than the previous work. We also show how to alter existing hierarchical clusterings to guarantee fairness and cluster balance across any level in the hierarchy.
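For readers unfamiliar with the metric, Dasgupta's cost charges each pair of points its similarity weight multiplied by the number of leaves under the pair's lowest common ancestor, so similar points that are separated high in the tree are expensive. The following is a minimal sketch of that definition only, not the paper's algorithm. The nested-tuple tree encoding, the `weight` dictionary, and the names `leaves` and `dasgupta_cost` are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaves under a tree encoded as nested tuples (assumed encoding)."""
    if not isinstance(tree, tuple):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def dasgupta_cost(tree, weight):
    """Sum, over all pairs, of similarity times the leaf count under the pair's LCA."""
    if not isinstance(tree, tuple):
        return 0.0  # a single leaf separates no pairs
    child_leaves = [leaves(child) for child in tree]
    size = sum(len(ls) for ls in child_leaves)
    # Pairs that stay together below this node are charged deeper in the tree.
    cost = sum(dasgupta_cost(child, weight) for child in tree)
    # Pairs split between two children have this node as their LCA.
    for left, right in combinations(child_leaves, 2):
        for i in left:
            for j in right:
                cost += weight.get(frozenset((i, j)), 0.0) * size
    return cost


# Hypothetical toy instance: a-b and c-d are strongly similar, a-c weakly.
weight = {
    frozenset(("a", "b")): 1.0,
    frozenset(("c", "d")): 1.0,
    frozenset(("a", "c")): 0.25,
}
good = (("a", "b"), ("c", "d"))  # merges similar points first
bad = (("a", "c"), ("b", "d"))   # separates similar points at the root

print(dasgupta_cost(good, weight))  # 5.0
print(dasgupta_cost(bad, weight))   # 8.5
```

In this toy instance the tree that merges the similar pairs first has the lower cost. A fair hierarchy may be forced to split similar points earlier in order to mix groups, which is the kind of cost increase the approximation factors above bound.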


research
06/18/2020

Fair Hierarchical Clustering

As machine learning has become more prevalent, researchers have begun to...
research
10/26/2020

KFC: A Scalable Approximation Algorithm for k-center Fair Clustering

In this paper, we study the problem of fair clustering on the k-center o...
research
05/27/2023

Fair Clustering via Hierarchical Fair-Dirichlet Process

The advent of ML-driven decision-making and policy formation has led to ...
research
02/22/2023

Improved Coresets for Clustering with Capacity and Fairness Constraints

We study coresets for clustering with capacity and fairness constraints....
research
02/06/2021

Promoting Fair Proposers, Fair Responders or Both? Cost-Efficient Interference in the Spatial Ultimatum Game

Institutions and investors face the constant challenge of making accurat...
research
05/27/2022

Prototype Based Classification from Hierarchy to Fairness

Artificial neural nets can represent and classify many types of data but...
research
06/09/2022

Improved Approximation for Fair Correlation Clustering

Correlation clustering is a ubiquitous paradigm in unsupervised machine ...
