Higher Criticism Tuned Regression For Weak And Sparse Signals

02/01/2020
by Tao Jiang, et al.

We propose a novel search scheme for the tuning parameter in high-dimensional penalized regression, aimed at variable selection and modeling when the sample size is small relative to the data dimension. The method is motivated by high-throughput biological data such as genome-wide association studies (GWAS) and epigenome-wide association studies (EWAS). We estimate the regularization parameter λ from an estimated lower bound, holding with confidence (1 - α), on the proportion of false null hypotheses. The bound is estimated from the empirical null distribution of the higher criticism statistic, a second-level significance test constructed from dependent p-values that are obtained with a multi-split regression and aggregation method. The estimate of λ is the value that corresponds to this lower bound on the proportion of false null hypotheses. We compare different penalized regression methods in the multi-split setting under varied signal sparsity and strength. We demonstrate the performance of our method using simulation experiments and applications to real data: (1) lipid-trait genetics from the Action to Control Cardiovascular Risk in Diabetes (ACCORD) clinical trial and (2) an epigenetic analysis of smoking's influence on differential methylation in the Agricultural Lung Health Study. The proposed algorithm is included in the HCTR package, available at https://cran.r-project.org/web/packages/HCTR/index.html.
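
To make the higher criticism statistic mentioned in the abstract concrete, here is a minimal Python sketch of its standard Donoho-Jin form for a set of p-values. It is an illustration under stated assumptions, not the HCTR implementation: the function name `higher_criticism` and the `alpha0` truncation parameter are hypothetical, and the sketch does not cover the empirical null distribution, the handling of dependent p-values, or the multi-split aggregation described above.

```python
import math


def higher_criticism(pvalues, alpha0=0.5):
    """Standard higher criticism statistic (illustrative sketch).

    For sorted p-values p_(1) <= ... <= p_(n), returns the maximum over
    i <= alpha0 * n of
        sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
    `alpha0` is an assumed truncation fraction, not a value from the paper.
    """
    p_sorted = sorted(pvalues)
    n = len(p_sorted)
    if n == 0:
        raise ValueError("need at least one p-value")
    k_max = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, k_max + 1):
        p = p_sorted[i - 1]
        if p <= 0.0 or p >= 1.0:
            continue  # the standardization is undefined at exactly 0 or 1
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


# Hypothetical inputs: an evenly spread "null-like" set of p-values, and the
# same set with five very small p-values standing in for sparse signals.
null_like = [(i - 0.5) / 100 for i in range(1, 101)]
with_signals = [1e-6] * 5 + null_like[5:]

hc_null = higher_criticism(null_like)
hc_signal = higher_criticism(with_signals)
```

A handful of very small p-values among otherwise uniform ones pushes the statistic far above its null-like value, which is the property a second-level test relies on to detect sparse signals.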


research
09/20/2021

Variable Selection in GLM and Cox Models with Second-Generation P-Values

Variable selection has become a pivotal choice in data analyses that imp...
research
06/24/2021

Multiple Testing for Composite Null with FDR Control Guarantee

False discovery rate (FDR) controlling procedures provide important stat...
research
05/03/2018

REMI: Regression with marginal information and its application in genome-wide association studies

In this study, we consider the problem of variable selection and estimat...
research
12/27/2022

Weak Signal Inclusion Under Dependence and Applications in Genome-wide Association Study

Motivated by the inquiries of weak signals in underpowered genome-wide a...
research
01/12/2014

Inference in High Dimensions with the Penalized Score Test

In recent years, there has been considerable theoretical development reg...
research
08/04/2016

Iterative Hard Thresholding for Model Selection in Genome-Wide Association Studies

A genome-wide association study (GWAS) correlates marker variation with ...
research
04/04/2018

Variable selection using pseudo-variables

Penalized regression has become a standard tool for model building acros...
