Correlated Feature Selection with Extended Exclusive Group Lasso

02/27/2020
by Yuxin Sun, et al.

In many high-dimensional classification or regression problems set in a biological context, the complete identification of the set of informative features is often as important as predictive accuracy, since this can provide mechanistic insight and conceptual understanding. Lasso and related algorithms have been widely used since their sparse solutions naturally identify a set of informative features. However, Lasso performs erratically when features are correlated. This limits the use of such algorithms in biological problems, where features such as genes often work together in pathways, leading to sets of highly correlated features. In this paper, we examine the performance of a Lasso derivative, the exclusive group Lasso, in this setting. We propose fast algorithms to solve the exclusive group Lasso, and introduce a solution to the case when the underlying group structure is unknown. The solution combines stability selection with random group allocation and the introduction of artificial features. Experiments with both synthetic and real-world data highlight the advantages of this proposed methodology over Lasso in the comprehensive selection of informative features.
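
As a rough illustration of the ideas named in the abstract, the sketch below computes the exclusive group Lasso penalty in its commonly cited form (the sum over groups of the squared L1 norm of each group's coefficients) and draws a random group allocation for the case where the true grouping is unknown. This is an assumption-laden sketch, not the paper's algorithm: the abstract does not state the exact penalty of the extended method, and the function names `exclusive_group_lasso_penalty` and `random_group_allocation` are hypothetical.

```python
import random


def exclusive_group_lasso_penalty(weights, groups):
    """Sum over groups of the squared L1 norm of that group's weights.

    Assumed standard form of the penalty; the paper's extended variant
    may differ.
    """
    group_l1 = {}
    for w, g in zip(weights, groups):
        group_l1[g] = group_l1.get(g, 0.0) + abs(w)
    return sum(v ** 2 for v in group_l1.values())


def random_group_allocation(n_features, n_groups, seed=None):
    """Hypothetical helper: shuffle near-equal-sized group labels over features."""
    labels = [i % n_groups for i in range(n_features)]
    random.Random(seed).shuffle(labels)
    return labels


groups = [0, 0, 1, 1]
w_spread = [1.0, 0.0, 1.0, 0.0]  # one active feature in each group
w_packed = [1.0, 1.0, 0.0, 0.0]  # both active features in the same group

# Both weight vectors have the same plain L1 norm (2.0), so the ordinary
# Lasso penalty cannot tell them apart.
penalty_spread = exclusive_group_lasso_penalty(w_spread, groups)
penalty_packed = exclusive_group_lasso_penalty(w_packed, groups)

# One random allocation of 10 features into 3 groups; stability selection
# would repeat such draws many times and aggregate the selected features.
allocation = random_group_allocation(10, 3, seed=0)
```

Under this form of the penalty, placing two active features in the same group costs more than spreading them across groups, which is what encourages sparsity within each group while keeping every group represented.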

research
07/01/2021

ControlBurn: Feature Selection by Sparse Forests

Tree ensembles distribute feature importance evenly amongst groups of co...
research
01/03/2022

Cluster Stability Selection

Stability selection (Meinshausen and Buhlmann, 2010) makes any feature s...
research
06/03/2022

Multivariate Sparse Group Lasso Joint Model for Radiogenomics Data

Radiogenomics is an emerging field in cancer research that combines medi...
research
02/09/2020

Learning High Order Feature Interactions with Fine Control Kernels

We provide a methodology for learning sparse statistical models that use...
research
12/30/2018

Allocation strategies for high fidelity models in the multifidelity regime

We propose a novel approach to allocating resources for expensive simula...
research
03/25/2015

Stable Feature Selection from Brain sMRI

Neuroimage analysis usually involves learning thousands or even millions...
research
06/27/2012

Discovering Support and Affiliated Features from Very High Dimensions

In this paper, a novel learning paradigm is presented to automatically i...