PLUTO: Penalized Unbiased Logistic Regression Trees

11/25/2014
by Wenwen Zhang, et al.

We propose a new algorithm called PLUTO for building logistic regression trees for binary response data. PLUTO captures nonlinear and interaction patterns in messy data by recursively partitioning the sample space and fitting a simple or multiple linear logistic regression model in each partition. It employs the cyclical coordinate descent method to estimate the multiple linear logistic regression models with elastic net penalties, which allows it to handle high-dimensional data efficiently. The tree structure provides a graphical description of the data; together with the logistic regression models, it yields an accurate classifier as well as a piecewise smooth estimate of the probability of "success". PLUTO controls selection bias by (1) separating split variable selection from split point selection and (2) applying an adjusted chi-squared test to select the split variable instead of an exhaustive search. A bootstrap calibration technique is employed to further correct selection bias. Comparisons on real datasets show that, on average, the multiple linear PLUTO models predict more accurately than other algorithms.
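The partition-then-fit structure can be sketched in a few lines of Python. The fragment below is only an illustration of the idea described in the abstract, not the authors' implementation: it selects a split variable with a plain chi-squared test on a contingency table of a binned predictor versus the binary response, splits at that variable's median as a stand-in for PLUTO's split-point search, and fits an elastic-net penalized logistic regression in each child node (scikit-learn's saga solver standing in for cyclical coordinate descent). The function names, the number of bins, the median split, and the l1_ratio value are illustrative assumptions; the adjusted chi-squared test and the bootstrap calibration step are not shown.

import numpy as np
from scipy.stats import chi2_contingency
from sklearn.linear_model import LogisticRegression


def chi2_split_score(x, y, n_bins=4):
    # Chi-squared statistic of a (binned x) vs. (binary y in {0, 1}) table;
    # a simple stand-in for PLUTO's adjusted chi-squared test.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    xb = np.digitize(x, np.unique(edges))
    table = np.zeros((xb.max() + 1, 2))
    for b, cls in zip(xb, y):
        table[b, int(cls)] += 1
    table = table[table.sum(axis=1) > 0]      # drop empty bins
    stat, _, _, _ = chi2_contingency(table)
    return stat


def pluto_stump(X, y, l1_ratio=0.5, C=1.0):
    # One level of recursive partitioning with elastic-net logistic leaves.
    # Step 1: split-variable selection, kept separate from split-point selection:
    # pick the column with the strongest chi-squared association with y.
    scores = [chi2_split_score(X[:, j], y) for j in range(X.shape[1])]
    j = int(np.argmax(scores))
    # Step 2: split point; the median is a simple placeholder.
    cut = np.median(X[:, j])
    left = X[:, j] <= cut
    # Step 3: elastic-net penalized logistic regression in each partition.
    models = {}
    for name, mask in (("left", left), ("right", ~left)):
        clf = LogisticRegression(penalty="elasticnet", solver="saga",
                                 l1_ratio=l1_ratio, C=C, max_iter=5000)
        clf.fit(X[mask], y[mask])
        models[name] = clf
    return j, cut, models

Applying pluto_stump recursively to each child node, with a stopping rule on node size or deviance reduction, would give a tree-structured, piecewise logistic model in the spirit of the method described above.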
