Margin Optimal Classification Trees

10/19/2022
by Federico D'Onofrio, et al.

In recent years there has been growing attention to interpretable machine learning models, which can give explanatory insights into their behavior. Thanks to their interpretability, decision trees have been intensively studied for classification tasks and, owing to remarkable advances in mixed-integer programming (MIP), various approaches have been proposed to formulate the problem of training an Optimal Classification Tree (OCT) as a MIP model. We present a novel mixed-integer quadratic formulation for the OCT problem, which exploits the generalization capabilities of Support Vector Machines for binary classification. Our model, denoted Margin Optimal Classification Tree (MARGOT), uses maximum-margin multivariate hyperplanes nested in a binary tree structure. To enhance the interpretability of our approach, we analyse two alternative versions of MARGOT that include feature selection constraints inducing local sparsity of the hyperplanes. First, MARGOT has been tested on non-linearly separable synthetic datasets in a 2-dimensional feature space to provide a graphical representation of the maximum-margin approach. Then, the proposed models have been tested on benchmark datasets from the UCI repository. The MARGOT formulation turns out to be easier to solve than other OCT approaches, and the generated tree generalizes better to new observations. The two interpretable versions are effective in selecting the most relevant features while maintaining good prediction quality.
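To make the idea of multivariate hyperplanes nested in a binary tree concrete, the following is a minimal sketch of how such a tree could route a sample at prediction time. It is not the MARGOT formulation itself, which is a mixed-integer quadratic program solved by an optimization solver; the `Node` class, the `predict` function, the sign convention for routing, and the hand-picked hyperplanes are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A branch node holding a hyperplane w.x + b = 0, or a leaf holding a label.

    Hypothetical structure for illustration; not taken from the paper.
    """
    w: Optional[Sequence[float]] = None
    b: float = 0.0
    left: Optional["Node"] = None    # followed when w.x + b < 0 (assumed convention)
    right: Optional["Node"] = None   # followed when w.x + b >= 0 (assumed convention)
    label: Optional[int] = None      # set only on leaves


def predict(node: Node, x: Sequence[float]) -> int:
    """Route x down the tree by the sign of each hyperplane and return the leaf label."""
    while node.label is None:
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        node = node.right if score >= 0 else node.left
    return node.label


def features_used(node: Node) -> int:
    """Count the nonzero coefficients of one hyperplane (its local sparsity)."""
    return sum(1 for wi in node.w if wi != 0.0)


# Hand-picked depth-2 tree on 2-D inputs (illustrative values, not fitted).
# It separates an XOR-like pattern that no single hyperplane can separate.
tree = Node(
    w=[1.0, 0.0], b=0.0,
    left=Node(w=[0.0, 1.0], b=0.0, left=Node(label=-1), right=Node(label=1)),
    right=Node(w=[0.0, 1.0], b=0.0, left=Node(label=1), right=Node(label=-1)),
)
```

Here each branch node happens to use a single feature, so `features_used` returns 1 for all of them; a dense multivariate hyperplane would return the full feature count. The feature selection constraints mentioned in the abstract aim to keep this count small at every node.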


research
02/15/2023

Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree

The interpretability of models has become a crucial issue in Machine Lea...
research
12/01/2021

Training Experimentally Robust and Interpretable Binarized Regression Models Using Mixed-Integer Programming

In this paper, we explore model-based approach to training robust and in...
research
06/01/2023

Loss-Optimal Classification Trees: A Generalized Framework and the Logistic Case

The Classification Tree (CT) is one of the most common models in interpr...
research
06/10/2022

Mixed integer linear optimization formulations for learning optimal binary classification trees

Decision trees are powerful tools for classification and regression that...
research
12/15/2020

Robust Optimal Classification Trees under Noisy Labels

In this paper we propose a novel methodology to construct Optimal Classi...
research
12/09/2021

On multivariate randomized classification trees: l_0-based sparsity, VC dimension and decomposition methods

Decision trees are widely-used classification and regression models beca...
research
12/03/2018

Interpretable Clustering via Optimal Trees

State-of-the-art clustering algorithms use heuristics to partition the f...
