Distribution Testing Under the Parity Trace

Distribution testing is a fundamental statistical task with many applications, but we are interested in a variety of problems where systematic mislabelings of the sample prevent us from applying the existing theory. To apply distribution testing to these problems, we introduce distribution testing under the parity trace, where the algorithm receives an ordered sample S that reveals only the least significant bit of each element. This abstraction reveals connections between the following three problems of interest, allowing new upper and lower bounds:

1. In distribution testing with a confused collector, the collector of the sample may be incapable of distinguishing between nearby elements of a domain (e.g. a machine learning classifier). We prove bounds for distribution testing with a confused collector on domains structured as a cycle or a path.

2. Recent work on the fundamental testing vs. learning question established tight lower bounds on distribution-free sample-based property testing by reduction from distribution testing, but the tightness is limited to symmetric properties. The parity trace allows a broader family of equivalences to non-symmetric properties, while recovering and strengthening many of the previous results with a different technique.

3. We give the first results for property testing in the well-studied trace reconstruction model, where the goal is to test whether an unknown string x satisfies some property or is far from satisfying that property, given only independent random traces of x.

Our main technical result is a tight bound of Θ((n/ϵ)^(4/5) + √n/ϵ^2) for testing uniformity of distributions over [n] under the parity trace, leading also to results for the problems above.
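To make the parity trace concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes that "ordered sample" means the sample sorted in increasing order and that [n] denotes {1, ..., n}. The function names `parity_trace`, `draw_sample`, and `sample_bound` are hypothetical, and `sample_bound` only evaluates the expression inside the Θ, ignoring the hidden constants.

```python
import random


def parity_trace(sample):
    """Sort the sample (assumed meaning of "ordered"), then keep only
    the least significant bit of each element."""
    return [x & 1 for x in sorted(sample)]


def draw_sample(weights, m, rng):
    """Draw m i.i.d. elements from {1, ..., n}, where n = len(weights)
    and weights[i - 1] is proportional to the probability of element i."""
    n = len(weights)
    return rng.choices(range(1, n + 1), weights=weights, k=m)


def sample_bound(n, eps):
    """Evaluate (n/eps)^(4/5) + sqrt(n)/eps^2.
    The constants hidden by the Theta notation are not modeled here."""
    return (n / eps) ** 0.8 + n ** 0.5 / eps ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 10, 8
    uniform = [1.0] * n                # uniform distribution over {1, ..., n}
    sample = draw_sample(uniform, m, rng)
    trace = parity_trace(sample)       # all the tester would get to see
    print(len(trace), set(trace) <= {0, 1})
    print(parity_trace([5, 2, 9, 4]))  # → [0, 0, 1, 1]
```

In this sketch the tester never sees the element values, only the sequence of bits produced by sorting; for instance, the sample [5, 2, 9, 4] sorts to [2, 4, 5, 9] and yields the trace [0, 0, 1, 1].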
