Relaxing the Assumptions of Knockoffs by Conditioning

03/07/2019
by Dongming Huang, et al.

The recent paper Candès et al. (2018) introduced model-X knockoffs, a method for variable selection that provably and non-asymptotically controls the false discovery rate with no restrictions or assumptions on the dimensionality of the data or the conditional distribution of the response given the covariates. The one requirement for the procedure is that the covariate samples are drawn independently and identically from a precisely known (but arbitrary) distribution. The present paper shows that the exact same guarantees can be made without knowing the covariate distribution fully, but instead knowing it only up to a parametric model with as many as Ω(n^* p) parameters, where p is the dimension and n^* is the number of covariate samples (which may exceed the usual sample size n of labeled samples when unlabeled samples are also available). The key is to treat the covariates as if they are drawn conditionally on their observed value for a sufficient statistic of the model. Although this idea is simple, even in Gaussian models conditioning on a sufficient statistic leads to a distribution supported on a set of zero Lebesgue measure, requiring techniques from topological measure theory to establish valid algorithms. We demonstrate how to do this for three models of interest, with simulations showing the new approach remains powerful under the weaker assumptions.
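
To make the conditioning idea concrete, here is a minimal toy sketch. It is not the paper's knockoff construction. It assumes a one-dimensional sample drawn i.i.d. from N(mu, sigma^2) with mu unknown and sigma known, so the sample mean is a sufficient statistic for mu. The function name `resample_given_mean` and all parameters are hypothetical, chosen only for illustration. The sketch draws a fresh sample whose distribution, conditional on the observed sample mean, matches that of the original data without ever using mu. The output lies on the hyperplane of vectors with a fixed mean, a set of zero Lebesgue measure in R^n, which is the phenomenon the abstract refers to.

```python
import random
from statistics import fmean


def resample_given_mean(x, sigma=1.0, rng=None):
    """Draw a sample with the same conditional law as x given its sample mean.

    Toy assumption: x_1..x_n are i.i.d. N(mu, sigma^2), mu unknown, sigma known.
    For Gaussian data the residuals x_i - mean(x) are independent of mean(x),
    so fresh centered Gaussian noise shifted to the observed mean has the
    correct conditional distribution, whatever mu is.
    """
    rng = rng or random.Random()
    z = [rng.gauss(0.0, sigma) for _ in x]   # fresh noise; mu is not needed
    z_bar = fmean(z)
    x_bar = fmean(x)                         # observed sufficient statistic
    return [zi - z_bar + x_bar for zi in z]  # recenter onto the observed mean


rng = random.Random(0)
x = [rng.gauss(3.0, 1.0) for _ in range(5)]  # the true mean 3.0 is never used again
x_new = resample_given_mean(x, sigma=1.0, rng=rng)
```

By construction, `x_new` has the same sample mean as `x` (up to floating-point error) while its individual entries differ, so the unknown parameter drops out of the resampling step entirely.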

