Model-Powered Conditional Independence Test

09/18/2017
by   Rajat Sen, et al.

We consider the problem of non-parametric conditional independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution f(x,y,z) of continuous random vectors X, Y and Z, we determine whether X ⫫ Y | Z. We approach this by converting the conditional independence test into a classification problem. This allows us to harness powerful classifiers such as gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better than the prior state of the art in high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution f^CI(x,y,z) = f(x|z)f(y|z)f(z), which equals the joint distribution if and only if X ⫫ Y | Z, when given access only to i.i.d. samples from the true joint distribution f(x,y,z). To tackle this problem we propose a novel nearest-neighbor bootstrap procedure and show theoretically that our generated samples are close to f^CI in total variation distance. We then develop generalization bounds for classification in our setting, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d., near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
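To make the idea of a nearest-neighbor bootstrap concrete, here is a minimal sketch of one way such a procedure could be written. The abstract does not spell out the details, so everything below is an assumption made for illustration: the function name `nn_bootstrap`, the even split into two halves, the use of a single nearest neighbor, and the Euclidean distance on z. Each sample in the first half keeps its own x and z but takes y from the sample in the second half whose z is closest, so the result roughly mimics f(x|z)f(y|z)f(z).

```python
import math
import random


def nn_bootstrap(samples, seed=0):
    """Hypothetical sketch: build samples (x, y', z) that roughly mimic f^CI.

    `samples` is a list of (x, y, z) tuples; each coordinate is a tuple of
    floats. The even split and 1-nearest-neighbor rule are assumptions.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)

    # Split into the samples to rewrite and the "donor" pool for y values.
    half = len(shuffled) // 2
    base, donors = shuffled[:half], shuffled[half:]

    bootstrapped = []
    for x, _, z in base:
        # Find the donor whose z is closest (Euclidean distance) to this z.
        _, y_donor, _ = min(donors, key=lambda d: math.dist(d[2], z))
        # Keep x and z, but swap in the donor's y.
        bootstrapped.append((x, y_donor, z))
    return bootstrapped
```

A classifier would then be trained to tell original samples from bootstrapped ones. Under this reading, accuracy close to chance would be evidence for X ⫫ Y | Z, while accuracy clearly above chance would be evidence against it.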

research
06/25/2018

Mimic and Classify : A meta-algorithm for Conditional Independence Testing

Given independent samples generated from the joint distribution p(x,y,z)...
research
04/09/2023

Nearest-Neighbor Sampling Based Conditional Independence Testing

The conditional randomization test (CRT) was recently proposed to test w...
research
01/12/2020

Reproducible Bootstrap Aggregating

Heterogeneity between training and testing data degrades reproducibility...
research
07/31/2019

Conditional independence testing: a predictive perspective

Conditional independence testing is a key problem required by many machi...
research
04/19/2022

Independence Testing for Bounded Degree Bayesian Network

We study the following independence testing problem: given access to sam...
research
01/09/2020

Minimax Optimal Conditional Independence Testing

We consider the problem of conditional independence testing of X and Y g...
research
06/16/2016

Unsupervised Risk Estimation Using Only Conditional Independence Structure

We show how to estimate a model's test error from unlabeled data, on dis...
