Fair Column Subset Selection

06/07/2023
by Antonis Matakos, et al.

We consider the problem of fair column subset selection. In particular, we assume that two groups are present in the data, and the chosen column subset must provide a good approximation for both, relative to their respective best rank-k approximations. We show that this fair setting introduces significant challenges: in order to extend known results, one cannot do better than the trivial solution of simply picking twice as many columns as the original methods. We adopt a known approach based on deterministic leverage-score sampling, and show that merely sampling a subset of appropriate size becomes NP-hard in the presence of two groups. Whereas finding a subset of two times the desired size is trivial, we provide an efficient algorithm that achieves the same guarantees with essentially 1.5 times the desired size. We validate our methods through an extensive set of experiments on real-world data.
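To make the "trivial" factor-of-two solution concrete, here is a minimal sketch, not the paper's 1.5-factor algorithm. It assumes each group's rank-k leverage scores have already been computed (for example, as squared row norms of the top-k right singular vectors of that group's matrix, so that each list sums to k), and that a column set is acceptable for a group when its scores sum to at least a threshold `theta`. The function names, the threshold convention, and the example scores are all hypothetical.

```python
def smallest_prefix(scores, theta):
    """Greedy single-group selection: take columns in order of decreasing
    leverage score until their scores sum to at least theta."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        if total >= theta:
            break
        chosen.append(i)
        total += scores[i]
    if total < theta:
        raise ValueError("theta exceeds the total leverage of this group")
    return chosen


def union_baseline(scores_a, scores_b, theta):
    """Trivial two-group baseline: solve each group separately and take the
    union, which is at most twice the size of the larger single-group set."""
    pick_a = smallest_prefix(scores_a, theta)
    pick_b = smallest_prefix(scores_b, theta)
    return sorted(set(pick_a) | set(pick_b))


# Hypothetical rank-2 leverage scores over five shared columns
# (each list sums to k = 2); the two groups favor different columns.
scores_a = [0.9, 0.8, 0.1, 0.1, 0.1]
scores_b = [0.1, 0.1, 0.9, 0.8, 0.1]
selected = union_baseline(scores_a, scores_b, theta=1.5)
```

In this example each group alone needs two columns, but the groups disagree on which ones, so the union contains four. That is the doubling the abstract refers to. Finding the smallest common set that meets both thresholds is the problem the abstract states is NP-hard.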
