Fast and Powerful Conditional Randomization Testing via Distillation

06/06/2020
by   Molei Liu, et al.

In relating a response variable Y to covariates (Z, X), a key question is whether Y is independent of the covariate X given Z. This question can be answered through conditional independence testing. The conditional randomization test (CRT), recently proposed by Candès et al. (2018), uses distributional information about X | Z to test exactly (non-asymptotically) for conditional independence between X and Y given Z, using any test statistic, in any dimensionality, and without assuming anything about Y | (Z, X). In principle, this flexibility allows one to derive powerful test statistics from complex state-of-the-art machine learning algorithms while maintaining exact control of the Type I error. Yet using such advanced test statistics directly in the CRT is prohibitively expensive computationally, especially with multiple testing, because the CRT requires recomputing the test statistic many times on resampled data. In this paper we propose a novel approach, called distillation, for using state-of-the-art machine learning algorithms in the CRT while drastically reducing the number of times those algorithms need to be run. This retains both their power and the CRT's statistical guarantees without the computational expense usually associated with their use in the CRT. In addition to distillation, we propose a number of other tricks that speed up the CRT without sacrificing its strong statistical guarantees. We show in simulation that all our proposals combined lead to a test that has the same power as the CRT but requires orders of magnitude less computation, making it a practical and powerful tool even for large data sets. We demonstrate our method's speed and power on a breast cancer dataset by identifying biomarkers related to cancer stage.
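To make the computational bottleneck concrete, here is a minimal sketch of the plain CRT described above, not of the paper's distillation procedure. It assumes a toy setting in which X | Z is Gaussian with a known linear mean. The function names (`crt_pvalue`, `ols_coef_statistic`, `sample_x_given_z`), the simulated data, and the choice of an OLS coefficient as the test statistic are all illustrative assumptions rather than anything specified in the abstract.

```python
import numpy as np


def ols_coef_statistic(y, x, z):
    """Hypothetical test statistic: |coefficient of x| in an OLS fit of y on (x, z)."""
    design = np.column_stack([x, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return abs(coef[0])


def crt_pvalue(y, x, z, sample_x_given_z, statistic, n_resamples=200, seed=0):
    """Plain CRT: compare the observed statistic with statistics on resampled X."""
    rng = np.random.default_rng(seed)
    t_obs = statistic(y, x, z)
    # The statistic is recomputed once per resample; this loop is the cost
    # that becomes prohibitive when the statistic wraps an expensive learner.
    t_null = np.array(
        [statistic(y, sample_x_given_z(z, rng), z) for _ in range(n_resamples)]
    )
    # Adding 1 to numerator and denominator counts the observed statistic itself.
    return (1 + np.sum(t_null >= t_obs)) / (n_resamples + 1)


# Toy data (assumed model): X | Z ~ N(Z @ gamma, 1), with gamma treated as known.
data_rng = np.random.default_rng(1)
n, p = 200, 5
gamma = data_rng.standard_normal(p)
beta = data_rng.standard_normal(p)
z = data_rng.standard_normal((n, p))
x = z @ gamma + data_rng.standard_normal(n)


def sample_x_given_z(z, rng):
    """Draw a fresh copy of X from the assumed conditional law of X given Z."""
    return z @ gamma + rng.standard_normal(z.shape[0])


# One response that depends on X given Z, and one that does not.
y_signal = z @ beta + 1.0 * x + data_rng.standard_normal(n)
y_null = z @ beta + data_rng.standard_normal(n)

p_signal = crt_pvalue(y_signal, x, z, sample_x_given_z, ols_coef_statistic)
p_null = crt_pvalue(y_null, x, z, sample_x_given_z, ols_coef_statistic)
print(f"p-value with signal: {p_signal:.4f}, p-value under the null: {p_null:.4f}")
```

With `n_resamples` resamples the statistic is evaluated `n_resamples + 1` times per tested covariate. That is cheap for an OLS fit but quickly becomes impractical when the statistic is, say, a cross-validated machine learning model and many covariates are tested.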

