How Smart Guessing Strategies Can Yield Massive Scalability Improvements for Sparse Decision Tree Optimization

12/01/2021
by Hayden McTavish, et al.

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960s, breakthroughs have been made only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: when you are having trouble optimizing, just guess.
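As a rough illustration of the threshold-guessing idea, the sketch below (a minimal example under stated assumptions, not the authors' implementation) uses the split thresholds chosen by a gradient-boosted ensemble, standing in for the reference black-box model, to binarize continuous features before handing the data to an optimal branch-and-bound tree solver. The function names are hypothetical and the solver call itself is omitted.

```python
# Minimal sketch of "threshold guessing", assuming scikit-learn is available.
# The split thresholds actually used by a boosted ensemble become the candidate
# binary features for the downstream optimal-tree search (solver not shown).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def guess_thresholds(X, y, n_estimators=40, max_depth=1, random_state=0):
    """Collect the (feature, threshold) splits used by a boosted ensemble."""
    gbm = GradientBoostingClassifier(n_estimators=n_estimators,
                                     max_depth=max_depth,
                                     random_state=random_state)
    gbm.fit(X, y)
    splits = set()
    for stage in gbm.estimators_:        # one row of regression trees per stage
        for tree in stage:
            t = tree.tree_
            for feat, thr in zip(t.feature, t.threshold):
                if feat >= 0:            # negative feature index marks a leaf
                    splits.add((int(feat), float(thr)))
    return sorted(splits)

def binarize(X, splits):
    """One binary column per guessed split: 1 if x[feature] <= threshold."""
    return np.column_stack([(X[:, f] <= thr).astype(int) for f, thr in splits])

# Hypothetical usage (any optimal sparse tree package that accepts binary
# features could consume X_bin):
#   splits = guess_thresholds(X_train, y_train)
#   X_bin  = binarize(X_train, splits)
```

The other guesses described in the abstract, roughly a tree size (depth) limit and a lower bound on achievable error derived from the black box, act as solver-side parameters rather than preprocessing, so they are not shown in this sketch.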
