Strong Optimal Classification Trees

03/29/2021
by Sina Aghaei, et al.

Decision trees are among the most popular machine learning models and are used routinely in applications ranging from revenue management and medicine to bioinformatics. In this paper, we consider the problem of learning optimal binary classification trees. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and by the tremendous improvements in mixed-integer optimization (MIO) technology. Yet existing MIO-based approaches from the literature do not leverage the power of MIO to its full extent: they rely on weak formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose an intuitive flow-based MIO formulation for learning optimal binary classification trees. Our formulation can accommodate side constraints to enable the design of interpretable and fair decision trees. Moreover, we show that our formulation has a stronger linear optimization relaxation than existing methods. We exploit the decomposable structure of our formulation and max-flow/min-cut duality to derive a Benders' decomposition method to speed up computation. We propose a tailored procedure for solving each decomposed subproblem that provably generates facets of the feasible set of the MIO as constraints to add to the main problem. We conduct extensive computational experiments on standard benchmark datasets, on which we show that our proposed approaches are 31 times faster than state-of-the-art MIO-based techniques and improve out-of-sample performance by up to 8%.
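The flow idea in the abstract can be illustrated with a small sketch. It is not the paper's MIO formulation: it assumes the tree is already fixed (a complete binary tree over binary features, nodes numbered heap-style), and it assumes one plausible way of building a unit-capacity graph per datapoint, in which a unit of flow reaches the sink only when the datapoint ends at a leaf predicting its true label. The names build_flow_graph, max_flow and count_correct are hypothetical and chosen for this illustration only.

```python
from collections import deque


def build_flow_graph(depth, branch_feature, leaf_label, x, y):
    """Unit-capacity graph for one datapoint (x, y) on a fixed tree.

    Assumed layout: internal nodes 1 .. 2**depth - 1, leaves 2**depth .. 2**(depth+1) - 1,
    children of node n are 2n (left, feature == 0) and 2n + 1 (right, feature == 1).
    """
    cap = {("s", 1): 1}  # source feeds the root
    first_leaf = 2 ** depth
    for n in range(1, first_leaf):
        goes_left = x[branch_feature[n]] == 0
        cap[(n, 2 * n)] = 1 if goes_left else 0
        cap[(n, 2 * n + 1)] = 0 if goes_left else 1
    for n in range(first_leaf, 2 * first_leaf):
        # A leaf is connected to the sink only if it predicts the true label.
        cap[(n, "t")] = 1 if leaf_label[n] == y else 0
    return cap


def max_flow(cap, source="s", sink="t"):
    """Edmonds-Karp maximum flow on a dict of edge capacities."""
    residual = dict(cap)
    for (u, v) in cap:
        residual.setdefault((v, u), 0)  # reverse edges for the residual graph
    neighbors = {}
    for (u, v) in residual:
        neighbors.setdefault(u, []).append(v)
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        total += bottleneck


def count_correct(depth, branch_feature, leaf_label, X, Y):
    """Number of correctly classified points = sum of per-point max flows."""
    return sum(
        max_flow(build_flow_graph(depth, branch_feature, leaf_label, x, y))
        for x, y in zip(X, Y)
    )


# Toy example (made-up data): a depth-2 tree that computes XOR of two binary features.
branch_feature = {1: 0, 2: 1, 3: 1}
leaf_label = {4: 0, 5: 1, 6: 1, 7: 0}
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 1]
print(count_correct(2, branch_feature, leaf_label, X, Y))
```

In this toy setting each max-flow value is 0 or 1, so by max-flow/min-cut duality a misclassified point corresponds to a cut of capacity 0 separating source from sink. In the paper's setting the tree itself is a decision variable; the sketch only shows why per-datapoint flow subproblems are natural candidates for a Benders-style decomposition.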


research
02/21/2020

Learning Optimal Classification Trees: Strong Max-Flow Formulations

We consider the problem of learning optimal binary classification trees....
research
06/10/2022

Mixed integer linear optimization formulations for learning optimal binary classification trees

Decision trees are powerful tools for classification and regression that...
research
04/21/2023

Rolling Lookahead Learning for Optimal Classification Trees

Classification trees continue to be widely adopted in machine learning a...
research
08/31/2021

Learning Optimal Prescriptive Trees from Observational Data

We consider the problem of learning an optimal prescriptive tree (i.e., ...
research
02/14/2023

Scalable Optimal Multiway-Split Decision Trees with Constraints

There has been a surge of interest in learning optimal decision trees us...
research
09/08/2021

Robust Optimal Classification Trees Against Adversarial Examples

Decision trees are a popular choice of explainable model, but just like ...
research
12/09/2021

On multivariate randomized classification trees: l_0-based sparsity, VC dimension and decomposition methods

Decision trees are widely-used classification and regression models beca...
