Simultaneous inference for generalized linear models with unmeasured confounders

09/13/2023
by Jin-Hong Du, et al.

Tens of thousands of simultaneous hypothesis tests are routinely performed in genomic studies to identify differentially expressed genes. However, due to unmeasured confounders, many standard statistical approaches may be substantially biased. This paper investigates the large-scale hypothesis testing problem for multivariate generalized linear models in the presence of confounding effects. Under arbitrary confounding mechanisms, we propose a unified statistical estimation and inference framework that harnesses orthogonal structures and integrates linear projections into three key stages. First, it leverages multivariate responses to separate marginal and uncorrelated confounding effects, recovering the column space of the confounding coefficients. Second, latent factors and primary effects are jointly estimated, using ℓ_1-regularization for sparsity while imposing orthogonality onto the confounding coefficients. Finally, we incorporate projected and weighted bias-correction steps for hypothesis testing. Theoretically, we establish identification conditions for the various effects and non-asymptotic error bounds. We show effective Type-I error control of asymptotic z-tests as the sample and response sizes approach infinity. Numerical experiments demonstrate that the proposed method, combined with the Benjamini-Hochberg procedure, controls the false discovery rate and is more powerful than alternative methods. By comparing single-cell RNA-seq counts from two groups of samples, we demonstrate the suitability of adjusting for confounding effects when significant covariates are absent from the model.
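
The abstract's last inferential step, turning per-gene z-tests into discoveries with the Benjamini-Hochberg procedure, can be illustrated with a minimal sketch. This is not the paper's estimation pipeline: it covers only the generic step-up multiple-testing rule, assumes the z-statistics are already available and approximately standard normal under the null, and uses hypothetical names (`two_sided_p_values`, `benjamini_hochberg`) and made-up z-scores.

```python
import math


def two_sided_p_values(z_scores):
    """Two-sided p-values for z-statistics under a standard normal null."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return [math.erfc(abs(z) / math.sqrt(2.0)) for z in z_scores]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking the hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical z-statistics for five genes (illustrative values only).
z_scores = [4.2, 0.3, -3.5, 1.1, -0.8]
p_values = two_sided_p_values(z_scores)
rejected = benjamini_hochberg(p_values, alpha=0.05)
print(rejected)
```

With these illustrative inputs only the first and third hypotheses are rejected. The step-up form matters: a hypothesis can be rejected even when its own p-value exceeds its rank threshold, provided a larger-ranked p-value falls below its threshold.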


