ecpc: An R-package for generic co-data models for high-dimensional prediction

05/16/2022
by Mirrelijn M. van Nee, et al.

High-dimensional prediction considers data with more variables than samples. Typical research goals are to find the best predictor or to select variables. Results may be improved by exploiting prior information in the form of co-data: complementary data not on the samples, but on the variables. We consider adaptive ridge-penalised generalised linear and Cox models, in which the variable-specific ridge penalties are adapted to the co-data to give a priori more weight to more important variables. The R-package ecpc originally accommodated various and possibly multiple co-data sources, including categorical co-data, i.e. groups of variables, and continuous co-data. Continuous co-data, however, was handled by adaptive discretisation, which may model the co-data inefficiently and lose information. Here, we present an extension to the method and software for generic co-data models, particularly for continuous co-data. At its basis lies a classical linear regression model that regresses prior variance weights on the co-data. The co-data model coefficients are then estimated with empirical Bayes moment estimation. Once the estimation procedure is cast in this classical regression framework, the extension to generalised additive and shape-constrained co-data models is straightforward. In addition, we show how the ridge penalties may be transformed to elastic net penalties with the R-package squeezy. In simulation studies, we first compare the various continuous co-data models of the extension with the original, discretisation-based method. Second, we compare variable selection performance with that of other variable selection methods. Finally, we demonstrate the use of the package in several examples throughout the paper.
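To illustrate the workflow described above, the sketch below simulates high-dimensional data with one continuous co-data source and fits a co-data-adaptive ridge model with ecpc. This is a minimal sketch, assuming co-data are supplied as a list of per-variable co-data matrices via an argument named Z; the simulated importance score is purely hypothetical, and argument names, co-data formats and defaults may differ between package versions, so consult the package documentation for the exact interface.

library(ecpc)

set.seed(1)
n <- 100; p <- 300
X <- matrix(rnorm(n * p), n, p)                  # n samples, p >> n variables
beta <- c(rnorm(30, sd = 0.5), rep(0, p - 30))   # few truly informative variables
Y <- as.vector(X %*% beta + rnorm(n))            # continuous response

# One hypothetical continuous co-data source: an external importance score per
# variable, here loosely related to the true effect sizes.
score <- abs(beta) + runif(p, 0, 0.1)

# Assumed co-data format: a list with one co-data matrix (p rows) per source.
Z <- list(score = matrix(score, ncol = 1))

fit <- ecpc(Y = Y, X = X, Z = Z, model = "linear")   # co-data-adaptive ridge fit
str(fit, max.level = 1)                              # inspect the fitted object

In the same spirit, logistic and Cox models, multiple co-data sources, and the generalised additive or shape-constrained co-data models of the extension would be specified through the model argument, additional elements of the co-data list, and the corresponding helper arguments documented in the package.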
