Statistical Inference for Genetic Relatedness Based on High-Dimensional Logistic Regression

by Rong Ma, et al.

This paper studies statistical inference for genetic relatedness between binary traits based on individual-level genome-wide association data. Specifically, under the high-dimensional logistic regression model, we define parameters characterizing the cross-trait genetic correlation, the genetic covariance, and the trait-specific genetic variance. A novel weighted debiasing method is developed for the logistic Lasso estimator, and computationally efficient debiased estimators are proposed. The rates of convergence of these estimators are studied and their asymptotic normality is established under mild conditions. Moreover, we construct confidence intervals and statistical tests for these parameters and provide theoretical justification for the methods, including the coverage probability and expected length of the confidence intervals, as well as the size and power of the proposed tests. Numerical studies on both model-generated data and simulated genetic data show the superiority of the proposed methods and their applicability to the analysis of real genetic data. Finally, by analyzing a real data set on autoimmune diseases, we obtain novel insights into the shared genetic architecture of ten pediatric autoimmune diseases.
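As a rough illustration of the three parameters named in the abstract, the sketch below assumes they take the usual quadratic-form shape: genetic covariance as beta' Sigma gamma, trait-specific genetic variance as beta' Sigma beta, and genetic correlation as their normalized ratio. The abstract does not state these definitions, so the forms, the function names, and the toy inputs are all assumptions made for illustration. The code only evaluates the parameters for known coefficient vectors; it does not implement the weighted debiasing estimator or the inference procedures.

```python
import math


def quad_form(u, sigma, v):
    """Return u' * sigma * v for plain Python lists (sigma is a square matrix)."""
    return sum(
        u[i] * sigma[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    )


def genetic_relatedness(beta, gamma, sigma):
    """Evaluate the assumed quadratic-form parameters for two traits.

    beta, gamma: logistic regression coefficient vectors of the two traits
                 (hypothetical inputs, taken as known here).
    sigma:       covariance matrix of the genetic variants.
    """
    cov = quad_form(beta, sigma, gamma)      # assumed genetic covariance
    var_beta = quad_form(beta, sigma, beta)  # assumed genetic variance, trait 1
    var_gamma = quad_form(gamma, sigma, gamma)  # assumed genetic variance, trait 2
    denom = math.sqrt(var_beta * var_gamma)
    # Convention chosen here: the correlation is 0 when either variance vanishes.
    corr = cov / denom if denom > 0 else 0.0
    return cov, var_beta, var_gamma, corr


# Toy example: three uncorrelated variants with unit variance.
sigma = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
beta = [1.0, 0.0, 0.0]
gamma = [1.0, 1.0, 0.0]
cov, var_beta, var_gamma, corr = genetic_relatedness(beta, gamma, sigma)
```

In practice the coefficient vectors are unknown and high-dimensional, which is why the abstract describes debiased estimators built on the logistic Lasso rather than a direct plug-in of this kind.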



Variance Estimation and Confidence Intervals from High-dimensional Genome-wide Association Studies Through Misspecified Mixed Model Analysis

We study variance estimation and associated confidence intervals for par...

Stochastic Approximation EM for Logistic Regression with Missing Values

Logistic regression is a common classification method in supervised lear...

Statistical Inference in High-Dimensional Generalized Linear Models with Asymmetric Link Functions

We have developed a statistical inference method applicable to a broad r...

Searching for genetic interactions in complex disease by using distance correlation

Understanding epistasis (genetic interaction) may shed some light on the...

Group Inference in High Dimensions with Applications to Hierarchical Testing

Group inference has been a long-standing question in statistics and the ...

Quantifying deviations from separability in space-time functional processes

The estimation of covariance operators of spatio-temporal data is in man...

Robust model-based estimation for binary outcomes in genomics studies

In quantitative genetics, statistical modeling techniques are used to fa...
