A likelihood-based approach for multivariate categorical response regression in high dimensions

07/15/2020
by   Aaron J. Molstad, et al.

We propose a penalized likelihood method to fit the bivariate categorical response regression model. Our method allows practitioners to estimate which predictors are irrelevant, which predictors affect only the marginal distributions of the bivariate response, and which predictors affect both the marginal distributions and the log odds ratios. To compute our estimator, we propose an efficient first-order algorithm, which we extend to the semi-supervised setting, where some subjects have only one response variable measured. We derive an asymptotic error bound that illustrates the performance of our estimator in high-dimensional settings. Generalizations to the multivariate categorical response regression model are proposed. Finally, simulation studies and an application in pan-cancer risk prediction demonstrate the usefulness of our method in terms of interpretability and prediction accuracy. An R package implementing the proposed method is available for download at github.com/ajmolstad/BvCategorical.
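
The distinction between affecting only the marginal distributions and also affecting the log odds ratios can be illustrated in the simplest case of two binary responses. The sketch below is not the authors' estimator and does not use the BvCategorical package; the function names (`log_odds_ratio`, `marginals`, `independent_table`) and the example probabilities are hypothetical, chosen only for illustration. It shows that two joint tables can share the same marginals yet differ in log odds ratio, and that changing the marginals of an independent table leaves the log odds ratio at zero.

```python
import math


def log_odds_ratio(joint):
    """Log odds ratio of a 2x2 joint probability table [[p00, p01], [p10, p11]]."""
    (p00, p01), (p10, p11) = joint
    return math.log(p00 * p11) - math.log(p01 * p10)


def marginals(joint):
    """Return (row marginals, column marginals) of a 2x2 joint table."""
    (p00, p01), (p10, p11) = joint
    return (p00 + p01, p10 + p11), (p00 + p10, p01 + p11)


def independent_table(row, col):
    """Joint table of two independent binary responses with the given marginals."""
    return [[r * c for c in col] for r in row]


# Hypothetical case 1: only the marginals change; the responses stay independent,
# so the log odds ratio is zero in both tables.
table_a = independent_table((0.5, 0.5), (0.5, 0.5))
table_b = independent_table((0.3, 0.7), (0.6, 0.4))

# Hypothetical case 2: same uniform marginals as table_a, but the responses are
# positively associated, so the log odds ratio is log(16).
table_c = [[0.4, 0.1], [0.1, 0.4]]

for name, table in [("a", table_a), ("b", table_b), ("c", table_c)]:
    print(name, marginals(table), round(log_odds_ratio(table), 4))
```

In the regression setting described above, the marginals and log odds ratios would each depend on the predictors; this fixed-table example only shows what the two kinds of quantities measure.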

research
10/25/2021

Sufficient reductions in regression with mixed predictors

Most data sets comprise of measurements on continuous and categorical va...
research
06/21/2022

Conditional probability tensor decompositions for multivariate categorical response regression

In many modern regression applications, the response consists of multipl...
research
03/31/2023

Regression and Classification of Compositional Data via a novel Supervised Log Ratio Method

Compositional data in which only the relative abundances of variables ar...
research
10/14/2022

Variable Importance Based Interaction Modeling with an Application on Initial Spread of COVID-19 in China

Interaction selection for linear regression models with both continuous ...
research
08/29/2022

Multiresolution categorical regression for interpretable cell type annotation

In many categorical response regression applications, the response categ...
research
01/11/2023

Multivariate Regression via Enhanced Response Envelope: Envelope Regularization and Double Descent

The envelope model provides substantial efficiency gains over the standa...
research
04/04/2021

Scalable algorithms for semiparametric accelerated failure time models in high dimensions

Semiparametric accelerated failure time (AFT) models are a useful altern...