On Computing Optimal Tree Ensembles

06/07/2023
by Christian Komusiewicz, et al.

Random forests and, more generally, (decision-)tree ensembles are widely used methods for classification and regression. Recent algorithmic advances make it possible to compute decision trees that are optimal with respect to measures such as size or depth. We are not aware of such research for tree ensembles and aim to contribute to this area. Our main contributions are two novel algorithms and corresponding lower bounds. First, we carry over and substantially improve on tractability results for decision trees, obtaining a (6δ D S)^S · poly-time algorithm, where S is the number of cuts in the tree ensemble, D is the largest domain size, and δ is the largest number of features in which two examples differ. To achieve this, we introduce the witness-tree technique, which also seems promising for practice. Second, we show that dynamic programming, which has been successful for decision trees, may also be viable for tree ensembles, providing an ℓ^n · poly-time algorithm, where ℓ is the number of trees and n is the number of examples. Finally, we compare the number of cuts necessary to classify training data sets for decision trees and tree ensembles, showing that ensembles may need exponentially fewer cuts as the number of trees increases.
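To make the parameters in the first running-time bound concrete, the following minimal sketch computes D and δ for a toy data set and evaluates the factor (6δ D S)^S. It is an illustration only, not the authors' algorithm: the function names and the toy examples are hypothetical, and it assumes that the "domain size" of a feature means the number of distinct values that feature takes among the given examples.

```python
from itertools import combinations


def largest_domain_size(examples):
    """D: the largest number of distinct values any single feature takes.

    Assumption: a feature's domain is the set of values observed
    for it in the given examples.
    """
    num_features = len(examples[0])
    return max(len({e[i] for e in examples}) for i in range(num_features))


def max_pairwise_difference(examples):
    """delta: the largest number of features in which two examples differ."""
    return max(
        (sum(a != b for a, b in zip(x, y)) for x, y in combinations(examples, 2)),
        default=0,
    )


def exponential_factor(delta, D, S):
    """The non-polynomial factor (6 * delta * D * S)^S from the abstract."""
    return (6 * delta * D * S) ** S


# Hypothetical toy data: four examples with three features each.
examples = [(0, 1, 2), (0, 3, 2), (1, 1, 2), (0, 1, 0)]

D = largest_domain_size(examples)
delta = max_pairwise_difference(examples)
factor = exponential_factor(delta, D, S=3)
print(D, delta, factor)
```

Because S appears both in the base and in the exponent, the factor grows quickly with the number of cuts, while D and δ only affect the base.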


