Streaming Algorithms for Diversity Maximization with Fairness Constraints

07/30/2022
by Yanhao Wang, et al.

Diversity maximization is a fundamental problem with wide applications in data summarization, web search, and recommender systems. Given a set X of n elements, the problem asks for a subset S of k ≪ n elements with maximum diversity, as quantified by the dissimilarities among the elements in S. In this paper, we focus on the diversity maximization problem with fairness constraints in the streaming setting. Specifically, we consider the max-min diversity objective, which selects a subset S that maximizes the minimum distance (dissimilarity) between any pair of distinct elements within it. Assuming that the set X is partitioned into m disjoint groups by some sensitive attribute, e.g., sex or race, ensuring fairness requires that the selected subset S contains exactly k_i elements from each group i ∈ [1,m]. A streaming algorithm should process X sequentially in one pass and return a subset with maximum diversity while satisfying the fairness constraint. Although diversity maximization has been extensively studied, the only known algorithms that can work with the max-min diversity objective and fairness constraints are very inefficient for data streams. Since diversity maximization is NP-hard in general, we propose two approximation algorithms for fair diversity maximization in data streams: the first achieves a (1-ε)/4-approximation and is specific to m=2, where ε ∈ (0,1), and the second achieves a (1-ε)/(3m+2)-approximation for an arbitrary m. Experimental results on real-world and synthetic datasets show that both algorithms provide solutions of comparable quality to the state-of-the-art algorithms while running several orders of magnitude faster in the streaming setting.
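
To make the objective and the constraint in the abstract concrete, the following is a minimal Python sketch of how one might evaluate a candidate subset S. It is not the paper's streaming algorithm: it only computes the max-min diversity value (the minimum pairwise distance within S) and checks that S contains exactly k_i elements from each group. The function names `diversity` and `is_fair`, the use of Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def diversity(points):
    """Max-min diversity value: the minimum distance over all pairs in the subset.

    Euclidean distance is an assumption here; any metric could be substituted.
    """
    if len(points) < 2:
        return math.inf  # no pair exists, so the minimum is vacuous
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def is_fair(groups, quotas):
    """Check that the subset has exactly quotas[i] elements from each group i.

    `groups` lists the group label of every selected element;
    `quotas` maps each group label i to its required count k_i.
    """
    counts = Counter(groups)
    no_unknown_groups = all(g in quotas for g in counts)
    return no_unknown_groups and all(counts.get(g, 0) == k for g, k in quotas.items())


# Toy example (hypothetical data): three selected points, two groups.
S = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]
S_groups = [1, 2, 1]
quotas = {1: 2, 2: 1}  # k_1 = 2, k_2 = 1, so k = 3

print(diversity(S))               # minimum pairwise distance in S
print(is_fair(S_groups, quotas))  # whether the fairness constraint holds
```

A fair diversity maximization algorithm would search for the subset that maximizes `diversity` among all subsets for which `is_fair` holds; the approximation ratios quoted above bound how far the returned value can fall below that optimum.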


research
10/18/2020

Diverse Data Selection under Fairness Constraints

Diversity is an important principle in data selection and summarization,...
research
01/05/2023

Max-Min Diversification with Fairness Constraints: Exact and Approximation Algorithms

Diversity maximization aims to select a diverse and representative subse...
research
02/08/2020

A General Coreset-Based Approach to Diversity Maximization under Matroid Constraints

Diversity maximization is a fundamental problem in web search and data m...
research
01/18/2022

Improved Approximation and Scalability for Fair Max-Min Diversification

Given an n-point metric space (𝒳,d) where each point belongs to one of m...
research
02/12/2018

Fair and Diverse DPP-based Data Summarization

Sampling methods that choose a subset of the data proportional to its di...
research
09/25/2018

Diversity maximization in doubling metrics

Diversity maximization is an important geometric optimization problem wi...
research
11/01/2022

Composable Coresets for Constrained Determinant Maximization and Beyond

We study the task of determinant maximization under partition constraint...
