High-dimensional sparse vine copula regression with application to genomic prediction

08/26/2022
by Özge Sahin, et al.

Genome-enabled prediction often involves high-dimensional data sets that exhibit nonlinear relationships and complex dependence structures. Vine copula based (quantile) regression is an important tool for such situations. However, current vine copula based regression approaches do not scale to high and ultra-high dimensions. We propose two methods for high-dimensional sparse vine copula based regression. First, we show their superiority in computational complexity over the existing methods. Second, we define relevant, irrelevant, and redundant explanatory variables for quantile regression. We then show, via simulation studies, our methods' power in selecting relevant variables and their prediction accuracy in high-dimensional sparse data sets. Next, we apply the proposed methods to high-dimensional real data, aiming at the genomic prediction of maize traits, and discuss some data-processing and feature extraction steps for these data. Finally, we show the advantage of our methods over linear models in the real data application.
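To make the idea of copula based quantile regression concrete, the following is a minimal sketch of the simplest possible case, not the sparse vine methods proposed in the paper. It assumes a single explanatory variable, a bivariate Gaussian copula with known correlation `rho`, and known marginal distributions; the function names and the choice of normal margins are illustrative assumptions. The predicted quantile is obtained by mapping x to the copula scale, inverting the conditional copula distribution at level alpha, and mapping back through the inverse margin of the response.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution, used for the Gaussian copula transforms.
_STD_NORMAL = NormalDist()


def gaussian_copula_conditional_quantile(u, alpha, rho):
    """Level-alpha quantile of V given U = u under a bivariate Gaussian copula.

    u and alpha must lie strictly between 0 and 1; rho is the copula
    correlation, strictly between -1 and 1.
    """
    z = rho * _STD_NORMAL.inv_cdf(u) + sqrt(1.0 - rho ** 2) * _STD_NORMAL.inv_cdf(alpha)
    return _STD_NORMAL.cdf(z)


def predict_quantile(x, alpha, rho, margin_x, margin_y):
    """Conditional level-alpha quantile of Y given X = x.

    margin_x and margin_y are assumed to be known marginal distributions
    exposing cdf and inv_cdf (here: statistics.NormalDist instances).
    """
    u = margin_x.cdf(x)                                       # to copula scale
    v = gaussian_copula_conditional_quantile(u, alpha, rho)   # conditional quantile
    return margin_y.inv_cdf(v)                                # back to response scale


# Illustrative usage with hypothetical standard normal margins.
margin_x = NormalDist(0.0, 1.0)
margin_y = NormalDist(0.0, 1.0)
median_prediction = predict_quantile(1.0, 0.5, 0.5, margin_x, margin_y)
```

With normal margins and a Gaussian copula, the conditional median reduces to the familiar linear regression line; the appeal of the copula approach is that other margins and pair copulas yield nonlinear conditional quantiles from the same three steps.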


