Testing Conditional Independence of Discrete Distributions

11/30/2017
by Clément L. Canonne, et al.

We study the problem of testing conditional independence for discrete distributions. Specifically, given samples from a discrete random variable (X, Y, Z) on domain [ℓ_1]×[ℓ_2]×[n], we want to distinguish, with probability at least 2/3, between the case that X and Y are conditionally independent given Z and the case that (X, Y, Z) is ϵ-far, in ℓ_1-distance, from every distribution that has this property. Conditional independence is a concept of central importance in probability and statistics, with a range of applications in various scientific domains. As such, the statistical task of testing conditional independence has been extensively studied in various forms within the statistics and econometrics communities for nearly a century. Perhaps surprisingly, this problem has not been previously considered in the framework of distribution property testing, and in particular no tester with sublinear sample complexity is known, even for the important special case in which the domains of X and Y are binary. The main algorithmic result of this work is the first conditional independence tester with sublinear sample complexity for discrete distributions over [ℓ_1]×[ℓ_2]×[n]. To complement our upper bounds, we prove information-theoretic lower bounds establishing that the sample complexity of our algorithm is optimal, up to constant factors, for a number of settings. Specifically, for the prototypical setting in which ℓ_1, ℓ_2 = O(1), we show that the sample complexity of testing conditional independence (upper bound and matching lower bound) is Θ(max(n^{1/2}/ϵ^2, min(n^{7/8}/ϵ, n^{6/7}/ϵ^{8/7}))).
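To get a feel for how the stated bound behaves, the following is a minimal Python sketch that evaluates the expression inside the Θ(·) for the setting ℓ_1, ℓ_2 = O(1), reading it as the maximum of n^{1/2}/ϵ^2 and the minimum of n^{7/8}/ϵ and n^{6/7}/ϵ^{8/7}. The function name `ci_sample_bound` is hypothetical, and all hidden constant factors are dropped, so the returned value indicates only the growth rate, not an actual number of samples to draw.

```python
def ci_sample_bound(n: int, eps: float) -> float:
    """Evaluate max(n^(1/2)/eps^2, min(n^(7/8)/eps, n^(6/7)/eps^(8/7))).

    Illustrative only: the Theta(.) in the abstract hides constant
    factors, which are all taken to be 1 here (an assumption).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if not (0.0 < eps <= 2.0):
        # The l_1 distance between two distributions is at most 2.
        raise ValueError("eps must lie in (0, 2]")

    term_sqrt = n ** 0.5 / eps ** 2               # n^{1/2} / eps^2
    term_78 = n ** (7 / 8) / eps                  # n^{7/8} / eps
    term_67 = n ** (6 / 7) / eps ** (8 / 7)       # n^{6/7} / eps^{8/7}
    return max(term_sqrt, min(term_78, term_67))


if __name__ == "__main__":
    # Compare the bound with the domain size n for a fixed eps.
    for n in (10**3, 10**6, 10**9):
        bound = ci_sample_bound(n, eps=0.5)
        print(f"n={n:>10}  bound~{bound:,.0f}  bound/n={bound / n:.4f}")
```

For fixed ϵ, the ratio of the bound to n shrinks as n grows, which is what sublinear sample complexity means here. For very small ϵ relative to n, the n^{1/2}/ϵ^2 term dominates.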


research
09/14/2020

Optimal Testing of Discrete Distributions with High Probability

We study the problem of testing discrete distributions with a focus on t...
research
11/09/2018

Two Party Distribution Testing: Communication and Security

We study the problem of discrete distribution testing in the two-party s...
research
07/06/2022

Comments on "Testing Conditional Independence of Discrete Distributions"

In this short note, we identify and address an error in the proof of The...
research
02/24/2019

Testing Preferential Domains Using Sampling

A preferential domain is a collection of sets of preferences which are l...
research
02/09/2020

Monotone probability distributions over the Boolean cube can be learned with sublinear samples

A probability distribution over the Boolean cube is monotone if flipping...
research
01/09/2020

Minimax Optimal Conditional Independence Testing

We consider the problem of conditional independence testing of X and Y g...
research
07/19/2022

Identity Testing for High-Dimensional Distributions via Entropy Tensorization

We present improved algorithms and matching statistical and computationa...
