On multivariate randomized classification trees: l_0-based sparsity, VC dimension and decomposition methods

12/09/2021
by Edoardo Amaldi, et al.

Decision trees are widely used classification and regression models because of their interpretability and good accuracy. Classical methods such as CART are based on greedy approaches, but growing attention has recently been devoted to optimal decision trees. We investigate the nonlinear continuous optimization formulation proposed in Blanquero et al. (EJOR, vol. 284, 2020; COR, vol. 132, 2021) for (sparse) optimal randomized classification trees. Sparsity is important not only for feature selection but also for improving interpretability. We first consider alternative methods to sparsify such trees based on concave approximations of the l_0 “norm”. Promising results are obtained on 24 datasets in comparison with l_1 and l_∞ regularizations. Then, we derive bounds on the VC dimension of multivariate randomized classification trees. Finally, since training is computationally challenging for large datasets, we propose a general decomposition scheme and an efficient version of it. Experiments on larger datasets show that the proposed decomposition method significantly reduces training times without compromising accuracy.
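
To illustrate what a concave approximation of the l_0 “norm” can look like, the following minimal Python sketch compares the exact nonzero count of a weight vector with an exponential surrogate, sum_i (1 - exp(-alpha * |w_i|)), and with the l_1 norm. The exponential form is one commonly used surrogate and is assumed here for illustration only; the abstract does not state which approximations the authors use. The function names and the parameter alpha are hypothetical.

```python
import math

def l0_norm(w, tol=1e-12):
    """Exact l_0 'norm': number of entries whose magnitude exceeds tol."""
    return sum(1 for x in w if abs(x) > tol)

def l0_exp_surrogate(w, alpha=5.0):
    """Illustrative surrogate (an assumption, not taken from the paper):
    sum_i (1 - exp(-alpha * |w_i|)), which is concave in each |w_i|
    and approaches the nonzero count as alpha grows."""
    return sum(1.0 - math.exp(-alpha * abs(x)) for x in w)

def l1_norm(w):
    """l_1 norm, shown for comparison with the surrogate."""
    return sum(abs(x) for x in w)

if __name__ == "__main__":
    w = [0.0, 0.0, 2.0, -0.5]  # hypothetical weight vector with 2 nonzeros
    for alpha in (1.0, 5.0, 50.0):
        print(f"alpha={alpha}: surrogate={l0_exp_surrogate(w, alpha):.4f}")
    print(f"l_0={l0_norm(w)}, l_1={l1_norm(w)}")
```

Unlike the l_1 norm, the surrogate's value depends only weakly on the size of a nonzero weight once alpha is large, which is why such terms behave more like a count of selected features than like a magnitude penalty.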

research
02/21/2020

Sparsity in Optimal Randomized Classification Trees

Decision trees are popular Classification and Regression tools and, when...
research
06/23/2022

Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features

Decision trees are one of the most useful and popular methods in the mac...
research
10/19/2021

Optimal randomized classification trees

Classification and Regression Trees (CARTs) are off-the-shelf techniques...
research
10/19/2022

Margin Optimal Classification Trees

In recent years there has been growing attention to interpretable machin...
research
10/26/2021

Learning Optimal Decision Trees Using MaxSAT

We present a Combinatorial Optimization approach based on Maximum Satisf...
research
03/29/2021

Strong Optimal Classification Trees

Decision trees are among the most popular machine learning models and ar...
research
10/13/2022

Fast Optimization of Weighted Sparse Decision Trees for use in Optimal Treatment Regimes and Optimal Policy Design

Sparse decision trees are one of the most common forms of interpretable ...
