Group Model Selection Using Marginal Correlations: The Good, the Bad and the Ugly

10/08/2012
by Waheed U. Bajwa, et al.

Group model selection is the problem of determining a small subset of groups of predictors (e.g., the expression data of genes) that are responsible for the majority of the variation in a response variable (e.g., the malignancy of a tumor). This paper focuses on group model selection in high-dimensional linear models, in which the number of predictors far exceeds the number of samples of the response variable. Existing works on high-dimensional group model selection either require the number of samples of the response variable to be significantly larger than the total number of predictors contributing to the response or impose restrictive statistical priors on the predictors and/or nonzero regression coefficients. This paper provides a comprehensive understanding of a low-complexity approach to group model selection that avoids some of these limitations. The proposed approach, termed Group Thresholding (GroTh), is based on thresholding of marginal correlations of groups of predictors with the response variable and is reminiscent of existing thresholding-based approaches in the literature. The most important contribution of the paper in this regard is relating the performance of GroTh to a polynomial-time verifiable property of the predictors for the general case of arbitrary (random or deterministic) predictors and arbitrary nonzero regression coefficients.
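
The abstract describes GroTh only at a high level, as thresholding the marginal correlations of groups of predictors with the response. The following is a minimal sketch of that idea, not the paper's exact procedure. It assumes equal-size contiguous groups, scores each group by the Euclidean norm of its marginal correlations, and keeps the k highest-scoring groups. The function name `group_threshold` and the parameters `group_size` and `k` are hypothetical and chosen for illustration.

```python
import math


def group_threshold(X, y, group_size, k):
    """Return the sorted indices of the k groups whose marginal
    correlations with y have the largest Euclidean norm.

    X is a list of n rows, each with p predictor values; y has n entries.
    Assumption: groups are contiguous blocks of `group_size` columns.
    """
    n = len(X)
    p = len(X[0])
    if p % group_size != 0:
        raise ValueError("number of predictors must be a multiple of group_size")

    # Marginal correlation of each predictor column with the response.
    corr = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]

    # Score each group by the l2 norm of its block of correlations.
    num_groups = p // group_size
    scores = []
    for g in range(num_groups):
        block = corr[g * group_size:(g + 1) * group_size]
        scores.append(math.sqrt(sum(c * c for c in block)))

    # Keep the k groups with the largest scores (assumed selection rule).
    ranked = sorted(range(num_groups), key=lambda g: scores[g], reverse=True)
    return sorted(ranked[:k])
```

With an orthonormal design the marginal correlations equal the regression coefficients, so the sketch recovers the active groups exactly. The high-dimensional case with correlated predictors is where the paper's verifiable property of the predictors would matter.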


