Distribution Testing Under the Parity Trace

Distribution testing is a fundamental statistical task with many applications, but we are interested in a variety of problems where systematic mislabelings of the sample prevent us from applying the existing theory. To apply distribution testing to these problems, we introduce distribution testing under the parity trace, where the algorithm receives an ordered sample S that reveals only the least significant bit of each element. This abstraction reveals connections between the following three problems of interest, allowing new upper and lower bounds:

1. In distribution testing with a confused collector, the collector of the sample may be incapable of distinguishing between nearby elements of a domain (e.g. a machine learning classifier). We prove bounds for distribution testing with a confused collector on domains structured as a cycle or a path.
2. Recent work on the fundamental testing vs. learning question established tight lower bounds on distribution-free sample-based property testing by reduction from distribution testing, but the tightness is limited to symmetric properties. The parity trace allows a broader family of equivalences to non-symmetric properties, while recovering and strengthening many of the previous results with a different technique.
3. We give the first results for property testing in the well-studied trace reconstruction model, where the goal is to test whether an unknown string x satisfies some property or is far from satisfying that property, given only independent random traces of x.

Our main technical result is a tight bound of Θ((n/ϵ)^(4/5) + √n/ϵ^2) for testing uniformity of distributions over [n] under the parity trace, leading also to results for the problems above.
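To make the parity trace concrete, the following is a minimal Python sketch, not the authors' code. It assumes that "ordered sample" means the sample sorted in increasing order and that the domain is [n] = {1, ..., n}. The function names (`parity_trace`, `draw_parity_trace`, `uniformity_bound`) are hypothetical, and `uniformity_bound` only evaluates the expression inside the Θ(·), ignoring the unspecified constant factors.

```python
import math
import random


def parity_trace(sample):
    """Sort the sample, then reveal only the least significant bit of each element.

    Assumption: "ordered" is read as "sorted in increasing order".
    """
    return [x & 1 for x in sorted(sample)]


def draw_parity_trace(weights, m, rng=None):
    """Draw m i.i.d. elements of {1, ..., n} with the given weights
    and return the parity trace of that sample."""
    rng = rng or random.Random()
    n = len(weights)
    sample = rng.choices(range(1, n + 1), weights=weights, k=m)
    return parity_trace(sample)


def uniformity_bound(n, eps):
    """Evaluate (n/eps)^(4/5) + sqrt(n)/eps^2, the expression inside the
    stated Theta bound. Constants are unknown, so this is only a scale."""
    return (n / eps) ** 0.8 + math.sqrt(n) / eps ** 2


if __name__ == "__main__":
    n = 8
    uniform = [1.0] * n
    print(draw_parity_trace(uniform, m=12, rng=random.Random(0)))
    print(uniformity_bound(n=1024, eps=0.5))
```

The tester sees only the resulting bit string: for example, the sample {5, 2, 9, 2} sorts to (2, 2, 5, 9) and yields the trace 0, 0, 1, 1, so the identities of the elements are hidden while their relative order and parities remain.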
