The Price of Fair PCA: One Extra Dimension

10/31/2018
by Samira Samadi, et al.

We investigate whether the standard dimensionality reduction technique of PCA inadvertently produces data representations with different fidelity for two different populations. We show that on several real-world data sets, PCA has higher reconstruction error on population A than on population B (for example, women versus men, or lower- versus higher-educated individuals). This can happen even when the data set has a similar number of samples from A and B. This motivates our study of dimensionality reduction techniques that maintain similar fidelity for A and B. We define the notion of Fair PCA and give a polynomial-time algorithm for finding a low-dimensional representation of the data that is nearly optimal with respect to this measure. Finally, we show on real-world data sets that our algorithm can be used to efficiently generate a fair low-dimensional representation of the data.
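The disparity described in the abstract can be illustrated with a minimal sketch. It fits standard PCA on pooled data and then measures the average reconstruction error separately for each group. It is not the Fair PCA algorithm from the paper. The synthetic groups `A` and `B`, their variance scales, and the helper names `pca_projection` and `mean_reconstruction_error` are all assumptions made for illustration.

```python
import numpy as np

def pca_projection(X_centered, d):
    """Return the (features x features) projection onto the top-d principal directions."""
    # The right singular vectors of centered data are the principal directions.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    V = vt[:d].T
    return V @ V.T

def mean_reconstruction_error(X_centered, P):
    """Average squared reconstruction error per sample after projecting with P."""
    residual = X_centered - X_centered @ P
    return float(np.sum(residual ** 2) / X_centered.shape[0])

rng = np.random.default_rng(0)

# Hypothetical synthetic data: both groups have the same number of samples,
# but A's variance is concentrated in two coordinates while B's is spread
# over the other four.
n, dim = 200, 6
scales_a = np.array([5.0, 4.0, 0.5, 0.5, 0.5, 0.5])
scales_b = np.array([0.5, 0.5, 2.0, 2.0, 2.0, 2.0])
A = rng.normal(size=(n, dim)) * scales_a
B = rng.normal(size=(n, dim)) * scales_b

# Fit standard PCA on the pooled, centered data.
X = np.vstack([A, B])
mean = X.mean(axis=0)
P = pca_projection(X - mean, d=2)

# Evaluate the same projection on each group separately.
err_a = mean_reconstruction_error(A - mean, P)
err_b = mean_reconstruction_error(B - mean, P)
print(f"group A error: {err_a:.2f}, group B error: {err_b:.2f}")
```

In this constructed example the two retained directions follow group A's high-variance coordinates, so group B is reconstructed less faithfully even though both groups are equally represented in the sample.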


research
02/28/2019

Fair Dimensionality Reduction and Iterative Rounding for SDPs

We model "fair" dimensionality reduction as an optimization problem. A c...
research
02/22/2021

Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized Autoencoders

High-dimensional data sets are often analyzed and explored via the const...
research
03/08/2021

Empirical comparison between autoencoders and traditional dimensionality reduction methods

In order to process efficiently ever-higher dimensional data such as ima...
research
09/16/2020

PCA Reduced Gaussian Mixture Models with Applications in Superresolution

Despite the rapid development of computational hardware, the treatment o...
research
07/10/2018

A GPU-Oriented Algorithm Design for Secant-Based Dimensionality Reduction

Dimensionality-reduction techniques are a fundamental tool for extractin...
research
09/27/2022

Linear Dimensionality Reduction

These notes are an overview of some classical linear methods in Multivar...
research
05/07/2019

Guided Visual Exploration of Relations in Data Sets

Efficient explorative data analysis systems must take into account both ...
