A Power Analysis of the Conditional Randomization Test and Knockoffs

10/05/2020
by Wenshuo Wang, et al.

In many scientific problems, researchers try to relate a response variable Y to a set of potential explanatory variables X = (X_1,…,X_p), and start by trying to identify the variables that contribute to this relationship. In statistical terms, this goal can be posed as identifying the X_j's on which Y is conditionally dependent. It is sometimes valuable to test this for every j simultaneously, a task more commonly known as variable selection. The conditional randomization test (CRT) and model-X knockoffs are two recently proposed methods that respectively perform conditional independence testing and variable selection. For each X_j, they compute any test statistic on the data and assess its significance by comparing it to test statistics computed on synthetic variables generated using knowledge of X's distribution. Our main contribution is to analyze their power in a high-dimensional linear model where the ratio of the dimension p to the sample size n converges to a positive constant. We give explicit expressions for the asymptotic power of the CRT, variable selection with CRT p-values, and model-X knockoffs, each with a test statistic based on either the marginal covariance, the least squares coefficient, or the lasso. One useful application of our analysis is the direct theoretical comparison of the asymptotic powers of variable selection with CRT p-values and model-X knockoffs; in the instances with independent covariates that we consider, the CRT provably dominates knockoffs. We also analyze the power gain from using unlabeled data in the CRT when limited knowledge of X's distribution is available, and the power of the CRT when samples are collected retrospectively.
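To make the resampling idea in the abstract concrete, here is a minimal sketch of a CRT p-value using a marginal-covariance statistic. It is an illustration under stated assumptions, not the authors' implementation: the covariates are taken to be independent standard Gaussians (so the conditional distribution of X_j given the other covariates is simply N(0, 1)), and the names `crt_pvalue`, `sample_independent_gaussian`, and `n_resamples` are hypothetical, chosen only for this example.

```python
import numpy as np


def crt_pvalue(X, y, j, sample_xj, n_resamples=500, rng=None):
    """Monte Carlo CRT p-value for covariate j (illustrative sketch).

    sample_xj(X, j, rng) must return a fresh copy of column j drawn from
    the distribution of X_j given the other columns. That distribution is
    assumed known, as in the model-X setting.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    def stat(xj):
        # Marginal-covariance statistic: |x_j^T y| / n (one of many choices).
        return abs(xj @ y) / n

    t_obs = stat(X[:, j])
    t_null = np.array(
        [stat(sample_xj(X, j, rng)) for _ in range(n_resamples)]
    )
    # The +1 terms count the observed statistic among the resamples,
    # which keeps the p-value valid for a finite number of resamples.
    return (1 + np.sum(t_null >= t_obs)) / (n_resamples + 1)


def sample_independent_gaussian(X, j, rng):
    # Assumption for this sketch: covariates are i.i.d. N(0, 1), so the
    # conditional law of X_j given the rest is N(0, 1) as well.
    return rng.standard_normal(X.shape[0])


# Toy linear model: only the first covariate affects the response.
rng = np.random.default_rng(0)
n, p = 300, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[0] = 1.0
y = X @ beta + rng.standard_normal(n)

p_signal = crt_pvalue(X, y, 0, sample_independent_gaussian, rng=1)
p_null = crt_pvalue(X, y, 1, sample_independent_gaussian, rng=2)
print(f"p-value for X_1 (signal): {p_signal:.4f}")
print(f"p-value for X_2 (null):   {p_null:.4f}")
```

With correlated covariates, `sample_independent_gaussian` would need to be replaced by a sampler for the true conditional distribution of X_j given the remaining columns; the rest of the sketch would stay the same.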

research
07/01/2022

Conditional Variable Selection for Intelligent Test

Intelligent test requires efficient and effective analysis of high-dimen...
research
10/08/2019

On the feasibility of parsimonious variable selection for Hotelling's T^2-test

Hotelling's T^2-test for the mean of a multivariate normal distribution ...
research
03/07/2019

Relaxing the Assumptions of Knockoffs by Conditioning

The recent paper Candès et al. (2018) introduced model-X knockoffs, a me...
research
10/14/2019

All of Linear Regression

Least squares linear regression is one of the oldest and widely used dat...
research
10/27/2019

A simple measure of conditional dependence

We propose a coefficient of conditional dependence between two random va...
research
09/30/2022

Experts in the Loop: Conditional Variable Selection for Accelerating Post-Silicon Analysis Based on Deep Learning

Post-silicon validation is one of the most critical processes in modern ...
research
10/06/2021

Deploying the Conditional Randomization Test in High Multiplicity Problems

This paper introduces the sequential CRT, which is a variable selection ...
