High-dimensional regression over disease subgroups

11/03/2016
by Frank Dondelinger, et al.

We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems in which subgroups such as disease subtypes may differ in their underlying regression models, while sample sizes at the subgroup level may be limited. We focus on the case in which subgroup-specific models are expected to be similar but not necessarily identical. Our approach is to treat subgroups as related problem instances and to estimate the subgroup-specific regression coefficients jointly. This is done in a penalized framework that combines an ℓ_1 term with an additional term penalizing differences between subgroup-specific coefficients. The resulting solutions are globally sparse while allowing information-sharing between subgroups. We present algorithms for estimation, along with empirical results on simulated data and on Alzheimer's disease, amyotrophic lateral sclerosis and cancer datasets. These examples demonstrate the gains our approach can offer in prediction and in the ability to estimate subgroup-specific sparsity patterns.
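
To make the penalized framework concrete, the following is a minimal sketch of one way such a joint estimate could be computed. The abstract does not state the exact form of the difference penalty or the estimation algorithm, so everything specific here is an assumption for illustration: the fusion term is taken to be a squared ℓ_2 penalty on all pairwise coefficient differences, the solver is plain proximal gradient descent rather than the authors' algorithm, and the names `fit_joint_sketch`, `objective`, `lam` and `gamma` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def objective(B, Xs, ys, lam, gamma):
    """Squared-error loss + l1 penalty + squared-l2 fusion penalty (assumed form)."""
    loss = sum(np.sum((y - X @ b) ** 2) / (2 * len(y))
               for X, y, b in zip(Xs, ys, B))
    l1 = lam * np.abs(B).sum()
    K = len(B)
    fusion = 0.5 * gamma * sum(np.sum((B[k] - B[j]) ** 2)
                               for k in range(K) for j in range(k + 1, K))
    return loss + l1 + fusion


def fit_joint_sketch(Xs, ys, lam=0.1, gamma=1.0, n_iter=2000):
    """Jointly estimate one coefficient vector per subgroup.

    Xs: list of (n_k, p) design matrices, one per subgroup.
    ys: list of (n_k,) response vectors, one per subgroup.
    Returns a (K, p) array whose k-th row is the estimate for subgroup k.
    """
    K, p = len(Xs), Xs[0].shape[1]
    B = np.zeros((K, p))
    # Step size from an upper bound on the Lipschitz constant of the smooth part.
    L = max(np.linalg.norm(X, 2) ** 2 / len(y) for X, y in zip(Xs, ys)) + gamma * K
    step = 1.0 / L
    for _ in range(n_iter):
        # Gradient of the per-subgroup squared-error losses.
        grad = np.stack([X.T @ (X @ b - y) / len(y)
                         for X, y, b in zip(Xs, ys, B)])
        # Gradient of the fusion term: gamma * sum_j (b_k - b_j).
        grad += gamma * (K * B - B.sum(axis=0))
        # The l1 term is handled by its proximal operator.
        B = soft_threshold(B - step * grad, step * lam)
    return B
```

In this sketch, `lam` controls global sparsity and `gamma` controls how strongly the subgroup estimates are pulled toward one another. Setting `gamma=0` reduces it to fitting a separate lasso per subgroup, while a large `gamma` pushes it toward a single pooled model.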

