Improved Approximation and Scalability for Fair Max-Min Diversification

01/18/2022 ∙ by Raghavendra Addanki, et al.

Given an n-point metric space (𝒳, d) where each point belongs to one of m = O(1) different categories or groups, and a set of integers k_1, …, k_m, the fair Max-Min diversification problem is to select k_i points belonging to category i ∈ [m] such that the minimum pairwise distance between selected points is maximized. The problem was introduced by Moumoulidou et al. [ICDT 2021] and is motivated by the need to down-sample large data sets in various applications so that the derived sample balances diversity, i.e., the minimum distance between a pair of selected points, and fairness, i.e., ensuring that enough points of each category are included. We prove the following results:

1. We first consider general metric spaces. We present a randomized polynomial-time algorithm that returns a 2-approximation to the diversity but only satisfies the fairness constraints in expectation. Building upon this result, we present a 6-approximation that is guaranteed to satisfy the fairness constraints up to a factor 1-ϵ for any constant ϵ. We also present a linear-time algorithm returning an (m+1)-approximation with exact fairness. The best previous result was a (3m-1)-approximation.

2. We then focus on Euclidean metrics. We first show that the problem can be solved exactly in one dimension. For a constant number of dimensions and categories and any constant ϵ > 0, we present a (1+ϵ)-approximation algorithm that runs in O(nk) + 2^{O(k)} time, where k = k_1 + … + k_m. We can improve the running time to O(nk) + poly(k) at the expense of only picking (1-ϵ) k_i points from category i ∈ [m]. Finally, we present algorithms suitable for processing massive data sets, including single-pass data stream algorithms and composable coresets for distributed processing.
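To make the problem statement concrete, the following is a minimal sketch of the objective and the fairness constraint only; it is not any of the algorithms from the paper. It assumes Euclidean points given as tuples, and the names `diversity`, `fair_max_min_bruteforce`, `groups`, and `quotas` are hypothetical choices for this illustration. The search is exhaustive and takes exponential time, so it is only usable on tiny inputs.

```python
from itertools import combinations, product
from math import dist, inf


def diversity(points):
    """Minimum pairwise Euclidean distance among the given points.

    Returns infinity when fewer than two points are given.
    """
    return min((dist(p, q) for p, q in combinations(points, 2)), default=inf)


def fair_max_min_bruteforce(points, groups, quotas):
    """Exhaustive search for the fair Max-Min diversification problem.

    points : list of coordinate tuples
    groups : groups[i] is the category of points[i]
    quotas : dict mapping category -> number of points k_i to select

    Returns (best diversity, sorted indices of one optimal selection),
    or (-inf, None) if some quota cannot be met.
    Illustration of the problem definition only (assumed helper,
    not the paper's method).
    """
    # Indices of the points available in each category.
    by_group = {g: [i for i, gi in enumerate(groups) if gi == g] for g in quotas}

    # All ways to pick exactly k_i indices from category i.
    per_group_choices = [combinations(by_group[g], k) for g, k in quotas.items()]

    best_value, best_choice = -inf, None
    for choice in product(*per_group_choices):
        selected = [i for part in choice for i in part]
        value = diversity([points[i] for i in selected])
        if value > best_value:
            best_value, best_choice = value, sorted(selected)
    return best_value, best_choice


# Tiny one-dimensional instance: two categories, pick 2 from category 0
# and 1 from category 1.
points = [(0,), (1,), (4,), (5,), (9,)]
groups = [0, 0, 1, 1, 0]
quotas = {0: 2, 1: 1}
value, chosen = fair_max_min_bruteforce(points, groups, quotas)
print(value, chosen)
```

On this instance the fairness constraint forces one of the two middle points (category 1) into the selection, which caps the achievable minimum distance below the unconstrained spread between the outermost points.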


research
∙ 05/04/2022

Max-Min k-Dispersion on a Convex Polygon

In this paper, we consider the following k-dispersion problem. Given a s...
research
∙ 07/30/2022

Streaming Algorithms for Diversity Maximization with Fairness Constraints

Diversity maximization is a fundamental problem with wide applications i...
research
∙ 05/19/2021

Approximation Algorithms For The Euclidean Dispersion Problems

In this article, we consider the Euclidean dispersion problems. Let P={p...
research
∙ 11/20/2022

Probabilistic bounds on the k-Traveling Salesman Problem and the Traveling Repairman Problem

The k-traveling salesman problem (k-TSP) seeks a tour of minimal length ...
research
∙ 10/18/2020

Diverse Data Selection under Fairness Constraints

Diversity is an important principle in data selection and summarization,...
research
∙ 05/24/2019

Learning Mahalanobis Metric Spaces via Geometric Approximation Algorithms

Learning Mahalanobis metric spaces is an important problem that has foun...
research
∙ 02/28/2019

Fair Dimensionality Reduction and Iterative Rounding for SDPs

We model "fair" dimensionality reduction as an optimization problem. A c...
