Model-free Feature Screening and FDR Control with Knockoff Features

08/19/2019
by   Wanjun Liu, et al.

This paper proposes a model-free and data-adaptive feature screening method for ultra-high dimensional data. The method is based on the projection correlation, which measures the dependence between two random vectors. It does not require specifying a regression model, and it applies to data with heavy-tailed errors and multivariate responses. It enjoys both the sure screening and rank consistency properties under weak assumptions. Further, a two-step approach is proposed to control the false discovery rate (FDR) in feature screening with the help of knockoff features. The two-step approach can be shown to achieve both sure screening and FDR control if the pre-specified FDR level α is greater than or equal to 1/s, where s is the number of active features. Numerical experiments and real data applications demonstrate the superior empirical performance of the proposed methods.
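
To make the idea of model-free marginal screening concrete, the following is a minimal sketch and not the authors' procedure. It ranks each feature by a dependence measure between that feature and the (possibly multivariate) response and keeps the top d. As a loud assumption, it swaps in the sample distance correlation as a stand-in for the projection correlation, which the abstract does not define; the function names `distance_correlation` and `screen` and the cutoff `d` are hypothetical and chosen only for illustration. The knockoff-based FDR step is not shown.

```python
import numpy as np


def _centered_dist(z):
    """Double-centered pairwise Euclidean distance matrix of the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]  # treat a vector as n observations of one variable
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def _dcor_from_centered(a, b):
    """Sample distance correlation from two double-centered distance matrices."""
    dcov2 = (a * b).mean()
    dvar = np.sqrt((a * a).mean() * (b * b).mean())
    if dvar == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / dvar))


def distance_correlation(x, y):
    """Stand-in dependence measure (NOT the projection correlation)."""
    return _dcor_from_centered(_centered_dist(x), _centered_dist(y))


def screen(X, Y, d):
    """Rank features by marginal dependence with Y and keep the top d.

    X: (n, p) feature matrix; Y: (n,) or (n, q) response; d: hypothetical cutoff.
    Returns the indices of the retained features and all p scores.
    """
    X = np.asarray(X, dtype=float)
    b = _centered_dist(Y)  # the response matrix is computed once and reused
    scores = np.array(
        [_dcor_from_centered(_centered_dist(X[:, j]), b) for j in range(X.shape[1])]
    )
    keep = np.argsort(-scores)[:d]
    return keep, scores
```

Calling `keep, scores = screen(X, Y, d=10)` returns the indices of the ten highest-ranked features. No regression model is fitted at any point, which is the sense in which such a ranking is model-free. Each score costs O(n²) time and memory in this naive form, so the sketch is meant for small n only.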

