Sequential algorithms for testing identity and closeness of distributions

05/12/2022 ∙ by Omar Fawzi, et al.

What advantage do sequential procedures provide over batch algorithms for testing properties of unknown distributions? Focusing on the problem of testing whether two distributions 𝒟_1 and 𝒟_2 on {1,…,n} are equal or ϵ-far, we give several answers to this question. We show that for a small alphabet size n, there is a sequential algorithm that outperforms any batch algorithm by a factor of at least 4 in terms of sample complexity. For a general alphabet size n, we give a sequential algorithm that uses no more samples than its batch counterpart, and possibly fewer if the actual distance TV(𝒟_1, 𝒟_2) between 𝒟_1 and 𝒟_2 is larger than ϵ. As a corollary, letting ϵ go to 0, we obtain a sequential algorithm for testing closeness, when no a priori bound on TV(𝒟_1, 𝒟_2) is given, with sample complexity 𝒪̃(n^{2/3}/TV(𝒟_1, 𝒟_2)^{4/3}): this improves over the 𝒪̃((n/log n)/TV(𝒟_1, 𝒟_2)^2) tester of <cit.> and is optimal up to multiplicative constants. We also establish limitations of sequential algorithms for the problem of testing identity and closeness: they can improve the worst-case number of samples by at most a constant factor.
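To make the sequential setting concrete, here is a minimal Python sketch of a naive plug-in sequential tester. It is not the algorithm of the paper and does not achieve the rates quoted above; its sample complexity scales like n/TV². It only illustrates how a stopping rule lets the number of samples adapt to the actual distance. The function names, the parameter `delta`, and the confidence radius (a standard L1 deviation bound for empirical distributions combined with a union bound over time) are all assumptions made for this illustration.

```python
import math
import random


def radius(t, n, delta):
    """Bound on |empirical TV - true TV| at time t, valid for all t at once w.p. >= 1 - delta."""
    # Union bound over time: budget delta / (t (t + 1)) at step t, split over two distributions.
    delta_t = delta / (2 * t * (t + 1))
    # L1 deviation bound for one empirical distribution on n symbols; since TV = L1 / 2,
    # the two distributions together shift the empirical TV by at most this amount.
    return math.sqrt(2 * (n * math.log(2) + math.log(1 / delta_t)) / t)


def sequential_closeness_test(sample1, sample2, n, eps, delta=0.05, max_steps=10**6):
    """Draw one sample from each distribution per step; stop as soon as the verdict is clear."""
    counts1, counts2 = [0] * n, [0] * n
    for t in range(1, max_steps + 1):
        counts1[sample1()] += 1
        counts2[sample2()] += 1
        tv_hat = 0.5 * sum(abs(a - b) for a, b in zip(counts1, counts2)) / t
        r = radius(t, n, delta)
        if tv_hat > r:          # true TV > 0, so under the promise the pair is eps-far
            return "far", t
        if tv_hat < eps - r:    # true TV < eps, so under the promise the pair is equal
            return "equal", t
    return "undecided", max_steps


rng = random.Random(0)


def sampler(weights):
    # Hypothetical helper: returns a function drawing one symbol from `weights`.
    return lambda: rng.choices(range(len(weights)), weights=weights)[0]


uniform = [1 / 3, 1 / 3, 1 / 3]
skewed = [5 / 6, 1 / 12, 1 / 12]  # TV(uniform, skewed) = 0.5

verdict_eq, t_eq = sequential_closeness_test(sampler(uniform), sampler(uniform), n=3, eps=0.2)
verdict_far, t_far = sequential_closeness_test(sampler(uniform), sampler(skewed), n=3, eps=0.2)
print(verdict_eq, t_eq, verdict_far, t_far)
```

In this sketch the test stops by the time the radius drops below ϵ/2 at the latest, which plays the role of the batch sample size, and it stops earlier when the two distributions are much farther apart than ϵ.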

