Regression Phalanxes

07/03/2017
by   Hongyang Zhang, et al.
0

Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensional data sets using hierarchical clustering and builds a prediction model for each phalanx for further ensembling. Through extensive simulation studies and several real-life applications in various areas (including drug discovery, chemical analysis of spectra data, microarray analysis and climate projections) we show that an ensemble of Regression Phalanxes improves prediction accuracy when combined with effective prediction methods like Lasso or Random Forests.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/26/2022

High-dimensional sparse vine copula regression with application to genomic prediction

High-dimensional data sets are often available in genome-enabled predict...
research
03/20/2013

Ensembling classification models based on phalanxes of variables with applications in drug discovery

Statistical detection of a rare class of objects in a two-class classifi...
research
04/15/2022

Accurate ADMET Prediction with XGBoost

The absorption, distribution, metabolism, excretion, and toxicity (ADMET...
research
11/18/2022

Modular regression

This paper develops a new framework, called modular regression, to utili...
research
11/25/2019

Random projections: data perturbation for classification problems

Random projections offer an appealing and flexible approach to a wide ra...
research
03/01/2020

Lebesgue Regression

We propose Lebesgue Regression, a non-parametric high-dimensional regres...
research
12/17/2019

Kernel-Based Ensemble Learning in Python

We propose a new supervised learning algorithm, for classification and r...

Please sign up or login with your details

Forgot password? Click here to reset