Consistency of The Oblique Decision Tree and Its Random Forest

11/23/2022
by   Haoran Zhan, et al.

The classification and regression tree (CART) and Random Forest (RF) are arguably the most popular pair of statistical learning methods. However, their statistical consistency can be proved only under very restrictive assumptions on the underlying regression function. As an extension of the standard CART, Breiman (1984) suggested using linear combinations of predictors as splitting variables. The method became known as the oblique decision tree (ODT) and has received considerable attention. ODT tends to perform better than CART and requires fewer partitions. In this paper, we further show that ODT is consistent for very general regression functions as long as they are continuous. We also prove the consistency of ODT-based random forests (ODRF) that use either a fixed-size or a random-size subset of features in feature bagging: the latter is also guaranteed to be consistent for general regression functions, whereas the former is consistent only for functions with specific structures. After refining the existing computer packages according to the established theory, our numerical experiments also show that ODRF offers a noticeable overall improvement over RF and other decision forests.
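To make the distinction in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation, of the two ideas it names: an oblique split that thresholds a linear combination of predictors (versus a CART split on a single predictor), and feature bagging with either a fixed-size or a random-size subset. The function names, the example weights, and the uniform draw of the subset size are all assumptions made for illustration; the paper's actual procedure may differ.

```python
import random


def axis_aligned_side(x, j, c):
    """CART-style split: compare the single coordinate x[j] with threshold c."""
    return x[j] <= c


def oblique_side(x, w, c):
    """Oblique split: compare the linear combination w . x with threshold c."""
    return sum(wi * xi for wi, xi in zip(w, x)) <= c


def sample_feature_subset(p, rng, fixed_size=None):
    """Pick the feature indices that one split is allowed to combine.

    fixed_size=q    -> always q features (fixed-size feature bagging)
    fixed_size=None -> subset size drawn uniformly from 1..p (random-size);
                       the uniform draw is an illustrative assumption.
    """
    q = fixed_size if fixed_size is not None else rng.randint(1, p)
    return sorted(rng.sample(range(p), q))


# Four points in the unit square, split by the line x0 + x1 = 1 (oblique)
# and by the vertical line x0 = 0.5 (axis-aligned).
points = [(0.2, 0.3), (0.9, 0.8), (0.1, 0.95), (0.7, 0.1)]
oblique = [oblique_side(x, (1.0, 1.0), 1.0) for x in points]
axis = [axis_aligned_side(x, 0, 0.5) for x in points]

rng = random.Random(0)
fixed_subset = sample_feature_subset(5, rng, fixed_size=2)  # always 2 features
random_subset = sample_feature_subset(5, rng)               # 1 to 5 features
```

A single oblique split can separate points along a diagonal boundary such as x0 + x1 = 1, which an axis-aligned tree can only approximate with a staircase of several splits; this is one intuition for why ODT may require fewer partitions.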


research
03/10/2019

Multinomial Random Forests: Fill the Gap between Theoretical Consistency and Empirical Soundness

Random forests (RF) are one of the most widely used ensemble learning me...
research
07/05/2022

An Approximation Method for Fitted Random Forests

Random Forests (RF) is a popular machine learning method for classificat...
research
10/26/2022

Ensemble Projection Pursuit for General Nonparametric Regression

The projection pursuit regression (PPR) has played an important role in ...
research
07/30/2020

Random Forests for dependent data

Random forest (RF) is one of the most popular methods for estimating reg...
research
04/23/2019

Regression-Enhanced Random Forests

Random forest (RF) methodology is one of the most popular machine learni...
research
02/08/2022

Is interpolation benign for random forests?

Statistical wisdom suggests that very complex models, interpolating trai...
research
06/25/2019

AMF: Aggregated Mondrian Forests for Online Learning

Random Forests (RF) is one of the algorithms of choice in many supervise...
