Fair Distributions from Biased Samples: A Maximum Entropy Optimization Framework

06/05/2019
by   L. Elisa Celis, et al.

One reason for the emergence of bias in AI systems is biased data: datasets that may not faithfully represent the underlying distributions and may over- or under-represent groups with respect to protected attributes such as gender or race. We consider the problem of correcting such biases and learning, from the given samples, distributions that are "fair" with respect to measures such as proportional representation and statistical parity. Our approach is based on a novel formulation of the problem of learning a fair distribution as a maximum entropy optimization problem with a given expectation vector and a prior distribution. Technically, our main contributions are: (1) a new second-order method to compute the (dual of the) maximum entropy distribution over an exponentially-sized discrete domain that turns out to be faster than previous methods, and (2) methods to construct prior distributions and expectation vectors that provably guarantee that the learned distributions satisfy a wide class of fairness criteria. Our results also come with quantitative bounds on the total variation distance between the empirical distribution obtained from the samples and the learned fair distribution. Our experimental results include testing our approach on the COMPAS dataset and showing that the fair distributions not only improve disparate impact values but, when used to train classifiers, incur only a small loss in accuracy.
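To make the formulation concrete, here is a minimal sketch of the maximum entropy dual on a small, fully enumerable discrete domain. The paper's contribution is a second-order method that handles exponentially-sized domains; this toy version only illustrates the same convex dual objective, where the learned distribution has the form p(x) ∝ q(x)·exp(λ·φ(x)) for a prior q, feature map φ, and target expectation vector θ. All names (`phi`, `prior`, `theta`) and the plain Newton iteration are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Illustrative sketch (not the paper's method): max-entropy distribution
# over a tiny discrete domain, fit by Newton's method on the convex dual.
rng = np.random.default_rng(0)
n, d = 16, 3                        # 16 domain elements, 3 features
phi = rng.normal(size=(n, d))       # feature map phi(x), one row per element
prior = np.full(n, 1.0 / n)         # prior distribution q (here: uniform)

# A feasible target expectation vector: a strictly positive convex
# combination of the feature rows lies in the interior of their hull.
w = rng.dirichlet(np.ones(n))
theta = phi.T @ w

def distribution(lam):
    """p(x) proportional to q(x) * exp(lam . phi(x)), normalized stably."""
    logits = np.log(prior) + phi @ lam
    logits -= logits.max()
    p = np.exp(logits)
    return p / p.sum()

# Newton iteration on the dual F(lam) = log Z(lam) - lam . theta:
# gradient is E_p[phi] - theta, Hessian is Cov_p[phi].
lam = np.zeros(d)
for _ in range(200):
    p = distribution(lam)
    grad = phi.T @ p - theta
    mean = phi.T @ p
    cov = (phi * p[:, None]).T @ phi - np.outer(mean, mean)
    lam -= np.linalg.solve(cov + 1e-9 * np.eye(d), grad)

p = distribution(lam)
```

At the optimum the learned distribution matches the target moments, E_p[φ] = θ, while staying as close as possible (in KL divergence) to the prior q; the fairness guarantees in the paper come from choosing q and θ appropriately.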


Related research

- Learning Fair Representations via an Adversarial Framework (04/30/2019)
- Fair and Diverse DPP-based Data Summarization (02/12/2018)
- Computing Maximum Entropy Distributions Everywhere (11/06/2017)
- Learning Fair Representations for Kernel Models (06/27/2019)
- A Novel Approach to Fairness in Automated Decision-Making using Affective Normalization (05/02/2022)
- Optimal Pre-Processing to Achieve Fairness and Its Relationship with Total Variation Barycenter (01/18/2021)
- An Information-Theoretic Perspective on the Relationship Between Fairness and Accuracy (10/17/2019)
