Learning Interpretable Models Using an Oracle

06/17/2019
by Abhishek Ghose, et al.

As Machine Learning (ML) becomes pervasive in real-world systems, the need for models to be interpretable or explainable has increased. We focus on interpretability, noting that models often need to be constrained in size to be considered understandable, e.g., a decision tree of depth 5 is easier to interpret than one of depth 50. This suggests a trade-off between interpretability and accuracy. We propose a technique to minimize this trade-off. Our strategy is to first learn a powerful, possibly black-box, probabilistic model on the data, which we refer to as the oracle. We then use the oracle to adaptively sample the training dataset, and the model of interest learns from the sampled data. Determining the sampling strategy is formulated as an optimization problem that uses only seven variables, independent of the dimensionality of the data. We empirically show that this often significantly increases the accuracy of the model of interest. Our technique is model-agnostic: both the interpretable model and the oracle may come from any model family. We present results on multiple real-world datasets, using Linear Probability Models and Decision Trees as interpretable models, and Gradient Boosted Models and Random Forests as oracles. Additionally, we discuss an interesting example of using a sentence-embedding-based text classifier as an oracle to improve the accuracy of a term-frequency-based bag-of-words linear classifier.
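The overall workflow described in the abstract (train an oracle, use it to resample the training data, then fit a small model on the sample) can be illustrated with a rough sketch. This is not the authors' formulation: the abstract does not specify the seven optimization variables or the optimizer, so the sketch below assumes a simplified stand-in with three hypothetical sampling parameters (`a`, `b`, `frac`), a Beta-shaped weighting over oracle uncertainty, and plain random search. The function names `oracle_uncertainty` and `fit_with_oracle_sampling` are likewise made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def oracle_uncertainty(oracle, X):
    """Uncertainty in [0, 1]: one minus the gap between the top two class probabilities."""
    proba = np.sort(oracle.predict_proba(X), axis=1)
    return 1.0 - (proba[:, -1] - proba[:, -2])


def fit_with_oracle_sampling(X_train, y_train, X_val, y_val,
                             max_depth=3, n_trials=20, seed=0):
    """Random search over a hypothetical 3-parameter sampling scheme (an assumption)."""
    rng = np.random.default_rng(seed)

    # Step 1: train a powerful probabilistic model, the "oracle".
    oracle = GradientBoostingClassifier(n_estimators=50, random_state=seed)
    oracle.fit(X_train, y_train)
    u = np.clip(oracle_uncertainty(oracle, X_train), 1e-3, 1 - 1e-3)

    # Baseline: the small tree trained on the original training data.
    best_model = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    best_model.fit(X_train, y_train)
    baseline_score = best_model.score(X_val, y_val)
    best_score = baseline_score

    # Step 2: search for a sampling distribution over oracle uncertainty.
    for _ in range(n_trials):
        a, b = rng.uniform(0.1, 5.0, size=2)   # assumed Beta-like shape parameters
        frac = rng.uniform(0.3, 1.0)           # assumed sample-size fraction
        weights = u ** (a - 1) * (1 - u) ** (b - 1)
        weights /= weights.sum()
        n = int(frac * len(X_train))
        idx = rng.choice(len(X_train), size=n, replace=True, p=weights)

        # Step 3: train the size-constrained model on the resampled data.
        model = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
        model.fit(X_train[idx], y_train[idx])
        score = model.score(X_val, y_val)
        if score > best_score:
            best_model, best_score = model, score

    return best_model, baseline_score, best_score


X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)
model, baseline, tuned = fit_with_oracle_sampling(X_train, y_train, X_val, y_val)
print(f"baseline={baseline:.3f} oracle-sampled={tuned:.3f}")
```

Because the sampling parameters are selected on the validation split, any reported gain should be confirmed on a separate held-out test set. Note also that the number of search parameters here does not depend on the number of features, which mirrors the dimensionality-independence property the abstract claims for its own formulation.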


research
05/04/2019

Optimal Resampling for Learning Small Models

Models often need to be constrained to a certain size for them to be con...
research
10/08/2022

Accurate Small Models using Adaptive Sampling

We highlight the utility of a certain property of model training: instea...
research
04/07/2022

Using Decision Tree as Local Interpretable Model in Autoencoder-based LIME

Nowadays, deep neural networks are being used in many domains because of...
research
12/01/2021

How Smart Guessing Strategies Can Yield Massive Scalability Improvements for Sparse Decision Tree Optimization

Sparse decision tree optimization has been one of the most fundamental p...
research
03/31/2023

DeforestVis: Behavior Analysis of Machine Learning Models with Surrogate Decision Stumps

As the complexity of machine learning (ML) models increases and the appl...
research
06/09/2022

There is no Accuracy-Interpretability Tradeoff in Reinforcement Learning for Mazes

Interpretability is an essential building block for trustworthiness in r...
research
08/29/2023

Probabilistic Dataset Reconstruction from Interpretable Models

Interpretability is often pointed out as a key requirement for trustwort...
