Weak Signal Inclusion Under Dependence and Applications in Genome-wide Association Study

12/27/2022
by X. Jessie Jeng, et al.

Motivated by the study of weak signals in underpowered genome-wide association studies (GWASs), we consider the problem of retaining true signals that are not strong enough to be individually separable from a large amount of noise. We address this challenge from the perspective of false negative control and present false negative control (FNC) screening, a data-driven method that efficiently regulates the false negative proportion at a user-specified level. FNC screening is developed in a realistic setting with arbitrary covariance dependence between variables. We calibrate the overall dependence through a parameter whose scale is compatible with the existing phase diagram in high-dimensional sparse inference. Using the new calibration, we characterize asymptotically the joint effect of covariance dependence, signal sparsity, and signal intensity on the proposed method. We interpret the results using a new phase diagram, which shows that FNC screening can efficiently select a set of candidate variables that retains a high proportion of signals even when the signals are not individually separable from noise. The finite-sample performance of FNC screening is compared with that of several existing methods in simulation studies. The proposed method outperforms the others in adapting to a user-specified false negative control level. We apply FNC screening to empower a two-stage GWAS procedure, which demonstrates substantial power gain when working with limited sample sizes in real applications.
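
To make the false negative proportion concrete, here is a minimal Python sketch on made-up toy data. It is not the FNC screening procedure from the paper: it is an oracle illustration that uses the true signal set, which a data-driven method would not have. The function names, the scores, and the choice of which variables are signals are all assumptions for illustration only.

```python
def false_negative_proportion(selected, signals):
    """Fraction of true signals that the selected set misses."""
    signals = set(signals)
    if not signals:
        return 0.0
    missed = signals - set(selected)
    return len(missed) / len(signals)


def smallest_top_k(scores, signals, level):
    """Oracle illustration (hypothetical helper, not the paper's method):
    find the smallest k such that keeping the k largest scores leaves a
    false negative proportion at or below `level`.
    It relies on the true signal set, which is unknown in practice."""
    # Rank variable indices from largest to smallest score.
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    for k in range(len(ranked) + 1):
        if false_negative_proportion(ranked[:k], signals) <= level:
            return k, ranked[:k]
    return len(ranked), ranked


# Toy data (made up): absolute test statistics for 10 variables.
# Variables 0-3 are treated as true signals; 2 and 3 are weak.
scores = [4.1, 2.2, 1.3, 0.9, 2.5, 1.1, 0.4, 1.8, 0.2, 0.7]
signals = [0, 1, 2, 3]

# Tolerate missing at most 25% of the true signals.
k, kept = smallest_top_k(scores, signals, level=0.25)
print(k, kept, false_negative_proportion(kept, signals))
```

In this toy example the weak signals rank below some noise variables, so reaching the target level requires keeping a few noise variables alongside the signals. That is the trade-off a screening rule aimed at false negatives has to manage.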


research
06/28/2020

Dual Control of Testing Errors in High-Dimensional Data Analysis

False negative errors are of major concern in applications where missing...
research
02/17/2021

Estimating The Proportion of Signal Variables Under Arbitrary Covariance Dependence

Estimating the proportion of signals hidden in a large amount of noise v...
research
05/27/2018

Adaptive Signal Inclusion With Genomic Applications

This paper addresses the challenge of efficiently capturing a high propo...
research
02/27/2020

False Discovery Rate Control Under General Dependence By Symmetrized Data Aggregation

We develop a new class of distribution–free multiple testing rules for f...
research
02/01/2020

Higher Criticism Tuned Regression For Weak And Sparse Signals

Here we propose a novel searching scheme for a tuning parameter in high-...
research
07/27/2022

Model-Free, Monotone Invariant and Computationally Efficient Feature Screening with Data-adaptive Threshold

Feature screening for ultrahigh-dimension, in general, proceeds with two...
research
05/17/2018

Covariance-Insured Screening

Modern bio-technologies have produced a vast amount of high-throughput d...
