KL Divergence Estimation with Multi-group Attribution

02/28/2022
by Parikshit Gopalan, et al.

Estimating the Kullback-Leibler (KL) divergence between two distributions given samples from them is well-studied in machine learning and information theory. Motivated by considerations of multi-group fairness, we seek KL divergence estimates that accurately reflect the contributions of sub-populations to the overall divergence. We model the sub-populations as coming from a rich (possibly infinite) family 𝒞 of overlapping subsets of the domain. We propose the notion of multi-group attribution for 𝒞, which requires that the estimated divergence conditioned on every sub-population in 𝒞 satisfies natural accuracy and fairness desiderata, such as ensuring that sub-populations where the model predicts significant divergence do indeed diverge significantly under the two distributions. Our main technical contribution is to show that multi-group attribution can be derived from the recently introduced notion of multi-calibration for importance weights [HKRR18, GRSW21]. We provide experimental evidence to support our theoretical results, and show that multi-group attribution yields better KL divergence estimates conditioned on sub-populations than other popular algorithms.
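
As a rough illustration of the kind of quantities involved (a minimal sketch, not the paper's algorithm), the code below assumes access to a learned importance-weight model w(x) ≈ p(x)/q(x): the plug-in KL estimate averages log w(x) over samples from P, and restricting the same average to a sub-population gives a simple per-group summary of divergence. The weight model, the subgroups, and the data are hypothetical placeholders; the multi-calibration machinery from the paper is not implemented here.

    import numpy as np

    def kl_estimate(log_w, mask=None):
        """Average log importance weight over P-samples, optionally restricted
        to the sub-population selected by a boolean mask."""
        if mask is not None:
            log_w = log_w[mask]
        return float(np.mean(log_w)) if log_w.size else 0.0

    # Synthetic stand-ins: x plays the role of samples from P, and log_w the
    # role of a learned log importance weight log p(x)/q(x) (placeholders only).
    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    log_w = 0.5 * x
    groups = {"x > 0": x > 0, "x <= 0": x <= 0}  # overlapping subsets are allowed in general

    print("overall estimate:", kl_estimate(log_w))
    for name, mask in groups.items():
        print(name, "->", kl_estimate(log_w, mask))

The per-group averages above are one natural way to attribute divergence to sub-populations; the paper's multi-group attribution requirement asks that such group-conditioned estimates be accurate for every subset in the family 𝒞.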

Related research

08/13/2020  A maximum value for the Kullback-Leibler divergence between quantum discrete distributions

11/04/2020  Independent Gaussian Distributions Minimize the Kullback-Leibler (KL) Divergence from Independent Gaussian Distributions

02/25/2020  Reliable Estimation of Kullback-Leibler Divergence by Controlling Discriminator Complexity in the Reproducing Kernel Hilbert Space

10/26/2020  Interpretable Assessment of Fairness During Model Evaluation

06/02/2023  KL-Divergence Guided Temperature Sampling

03/10/2021  Multicalibrated Partitions for Importance Weights

10/30/2015  Principal Differences Analysis: Interpretable Characterization of Differences between Distributions
