Fair and Diverse DPP-based Data Summarization

02/12/2018
by L. Elisa Celis, et al.

Sampling methods that select a subset of the data with probability proportional to its diversity in the feature space are popular for data summarization. However, recent studies have noted that such methods can produce biased summaries, for example by under- or over-representing a particular gender or race. In this paper we initiate the study of outputting a summary of a given dataset that is both diverse and fair. We work with a well-studied determinantal measure of diversity and the corresponding distributions, determinantal point processes (DPPs), and present a framework for incorporating a general class of fairness constraints into such distributions. Designing efficient algorithms to sample from these constrained determinantal distributions runs into a complexity barrier; we therefore present a fast sampler that is provably good when the input vectors satisfy a natural property. Our experimental results on a real-world dataset and an image dataset show that the diversity of the samples produced under fairness constraints is not too far from that of the unconstrained case, and we provide a theoretical explanation for this observation.
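To make the idea of a fairness-constrained determinantal distribution concrete, here is a minimal sketch in Python. It assumes the fairness constraint takes the form of an exact quota per group, which is only one possible instance of the general class the abstract mentions. It scores each feasible subset by the determinant of the Gram matrix of its feature vectors and enumerates all feasible subsets by brute force. This takes exponential time and is not the fast sampler the paper presents. The names `gram_det`, `constrained_dpp_distribution`, and `sample_fair_summary` are hypothetical and chosen for illustration.

```python
import itertools
import random


def gram_det(vectors):
    """Determinant of the Gram matrix of `vectors` (the determinantal diversity score)."""
    n = len(vectors)
    g = [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]
    det = 1.0
    for col in range(n):
        # Gaussian elimination with partial pivoting.
        pivot = max(range(col, n), key=lambda r: abs(g[r][col]))
        if abs(g[pivot][col]) < 1e-12:
            return 0.0  # linearly dependent vectors span zero volume
        if pivot != col:
            g[col], g[pivot] = g[pivot], g[col]
            det = -det
        det *= g[col][col]
        for r in range(col + 1, n):
            factor = g[r][col] / g[col][col]
            for c in range(col, n):
                g[r][c] -= factor * g[col][c]
    return det


def constrained_dpp_distribution(vectors, groups, quotas):
    """Exact distribution over subsets holding exactly quotas[g] items of each group g.

    Assumption: fairness is expressed as exact per-group quotas.
    Brute force over all feasible subsets, so only suitable for tiny inputs.
    """
    by_group = {}
    for idx, grp in enumerate(groups):
        by_group.setdefault(grp, []).append(idx)
    choices = [itertools.combinations(by_group.get(grp, []), k)
               for grp, k in quotas.items()]
    weights = {}
    for combo in itertools.product(*choices):
        subset = tuple(sorted(i for part in combo for i in part))
        weights[subset] = gram_det([vectors[i] for i in subset])
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("no feasible subset has positive determinant")
    return {s: w / total for s, w in weights.items()}


def sample_fair_summary(vectors, groups, quotas, rng=random):
    """Draw one subset from the quota-constrained determinantal distribution."""
    dist = constrained_dpp_distribution(vectors, groups, quotas)
    subsets = list(dist)
    return rng.choices(subsets, weights=[dist[s] for s in subsets], k=1)[0]
```

In a toy input with two groups that each contain the vectors (1, 0) and (0, 1), and a quota of one item per group, subsets pairing identical vectors receive probability zero, while the two subsets pairing orthogonal vectors split the probability mass evenly. Every sampled summary is therefore diverse and also meets the quotas.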

