Testing Mixtures of Discrete Distributions

07/06/2019
by Maryam Aliakbarpour, et al.

The sample complexity of testing properties of distributions over large domains has been studied extensively. For many properties, the sample complexity is known to be substantially smaller than the domain size. For example, over a domain of size n, distinguishing the uniform distribution from distributions that are far from uniform in ℓ_1-distance uses only O(√(n)) samples. However, the picture is very different in the presence of arbitrary noise, even when the amount of noise is quite small. In this case, one must distinguish whether the samples come from a distribution that is ϵ-close to uniform or from one that is (1-ϵ)-far from uniform. The latter task requires a number of samples that is nearly linear in n [Valiant 2008, Valiant and Valiant 2011]. In this work, we present a noise model that, on the one hand, is more tractable for the testing problem and, on the other hand, represents a rich class of noise families. In our model, the noisy distribution is a mixture of the original distribution and noise, where the latter is known to the tester either explicitly or via sample access; the form of the noise is also known a priori. Focusing on the identity and closeness testing problems leads to the following mixture testing question: given samples of distributions p, q_1, q_2, can we test whether p is a mixture of q_1 and q_2? We consider this general question in various scenarios that differ in how the tester can access the distributions, and show that this problem is indeed more tractable. Our results show that the sample complexity of our testers is exactly the same as in the classical non-mixture case.
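To make the mixture question concrete, here is a minimal Python sketch for the simplest conceivable setting, in which p, q_1 and q_2 are all given explicitly as probability vectors and no sampling is involved. It is not one of the testers from the paper; it only illustrates what "p is a mixture of q_1 and q_2" means by finding the weight α in [0, 1] that minimizes the ℓ_1-distance between p and α·q_1 + (1-α)·q_2. The function names (`l1_distance`, `mix`, `best_mixture_weight`) are hypothetical and chosen for illustration.

```python
def l1_distance(a, b):
    """l1 distance between two vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mix(alpha, q1, q2):
    """The mixture alpha * q1 + (1 - alpha) * q2, coordinate by coordinate."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(q1, q2)]


def best_mixture_weight(p, q1, q2):
    """Return (distance, alpha) minimizing ||p - mix(alpha, q1, q2)||_1 over [0, 1].

    The objective is convex and piecewise linear in alpha, so a minimum is
    attained at an endpoint or at a breakpoint where some coordinate of the
    mixture equals the matching coordinate of p. Assumes explicit vectors,
    which is an illustrative simplification, not the sample-based setting.
    """
    candidates = {0.0, 1.0}
    for pi, ai, bi in zip(p, q1, q2):
        if ai != bi:
            alpha = (pi - bi) / (ai - bi)  # solves pi = alpha*ai + (1-alpha)*bi
            if 0.0 <= alpha <= 1.0:
                candidates.add(alpha)
    return min((l1_distance(p, mix(a, q1, q2)), a) for a in candidates)


# Illustrative inputs: q1 and q2 have disjoint supports on a domain of size 4.
q1 = [0.5, 0.5, 0.0, 0.0]
q2 = [0.0, 0.0, 0.5, 0.5]
p_mixture = mix(0.25, q1, q2)        # an exact mixture with weight 0.25
p_other = [1.0, 0.0, 0.0, 0.0]       # not a mixture of q1 and q2

dist_mixture, alpha_mixture = best_mixture_weight(p_mixture, q1, q2)
dist_other, alpha_other = best_mixture_weight(p_other, q1, q2)
```

In the sample-access scenarios the abstract refers to, the tester does not see these vectors and must decide from samples alone, which is where the sample-complexity question arises.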


