Finding Skewed Subcubes Under a Distribution

11/18/2019
by Parikshit Gopalan, et al.

Say that we are given samples from a distribution ψ over an n-dimensional space. We expect or desire ψ to behave like a product distribution (or a k-wise independent distribution over its marginals for small k). We propose the problem of enumerating/list-decoding all large subcubes where the distribution ψ deviates markedly from what we expect; we refer to such subcubes as skewed subcubes. Skewed subcubes are certificates of dependencies between small subsets of variables in ψ. We motivate this problem by showing that it arises naturally in the context of algorithmic fairness and anomaly detection. In this work we focus on the special but important case where the space is the Boolean hypercube and the expected marginals are uniform. We show that the obvious definition of skewed subcubes can lead to intractable list sizes, and we propose a better definition, that of minimal skewed subcubes: subcubes whose skew cannot be attributed to a larger subcube that contains them. Our main technical contribution is a list-size bound for this definition and an algorithm that efficiently finds all such subcubes. Both the bound and the algorithm rely on Fourier-analytic techniques, especially the powerful hypercontractive inequality. On the lower-bound side, we show that finding skewed subcubes is as hard as the sparse noisy parity problem, which is believed to be intractable; hence our algorithms cannot be improved substantially without a breakthrough on that problem. Motivated by this, we study alternate models allowing query access to ψ, in which finding skewed subcubes might be easier.
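To make the notion of a skewed subcube concrete, here is a minimal Python sketch for the Boolean hypercube with uniform expected marginals. It is a naive brute-force scan, not the Fourier-analytic algorithm the abstract refers to. The skew criterion used here (empirical mass at least `ratio` times the uniform mass 2^(-size)) is an assumption made for illustration, since the abstract does not state the exact definition, and the names `subcube_mass` and `naive_skewed_subcubes` are hypothetical.

```python
from itertools import combinations, product


def subcube_mass(samples, fixed):
    """Fraction of samples that agree with `fixed` (dict: coordinate -> bit)."""
    hits = sum(all(x[i] == b for i, b in fixed.items()) for x in samples)
    return hits / len(samples)


def naive_skewed_subcubes(samples, n, k, ratio):
    """Brute-force scan over all subcubes that fix between 1 and k coordinates.

    Assumed criterion (illustrative only): a subcube fixing `size` coordinates
    is flagged when its empirical mass is at least `ratio` times the mass
    2**-size it would have under the uniform distribution.
    """
    found = []
    for size in range(1, k + 1):
        for coords in combinations(range(n), size):
            for bits in product((0, 1), repeat=size):
                fixed = dict(zip(coords, bits))
                mass = subcube_mass(samples, fixed)
                expected = 2.0 ** -size
                if mass >= ratio * expected:
                    found.append((fixed, mass, expected))
    return found


# Toy data on {0,1}^3: coordinates 0 and 1 always agree, coordinate 2 is free.
# Every single coordinate is perfectly balanced, so only pairs reveal the skew.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]

found = naive_skewed_subcubes(samples, n=3, k=2, ratio=1.5)
for fixed, mass, expected in found:
    print(fixed, mass, expected)
```

On this toy input the scan flags exactly the two subcubes x0 = x1 = 0 and x0 = x1 = 1, each carrying mass 0.5 against an expected 0.25. Raising `k` to 3 additionally flags the four smaller subcubes contained in those two, which hints at why a naive definition inflates the list and why a notion of minimality is useful.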


