Understanding Interpretability by generalized distillation in Supervised Classification

12/05/2020
by Adit Agarwal, et al.

The ability to interpret decisions taken by Machine Learning (ML) models is fundamental to encouraging trust and reliability in practical applications. Recent interpretation strategies focus on human understanding of the underlying decision mechanisms of complex ML models, which leaves them restricted by the subjective biases of humans. To dissociate from such human biases, we propose an interpretation-by-distillation formulation that is defined relative to other ML models rather than to human judgment. We generalize the distillation technique to quantify interpretability from an information-theoretic perspective, removing the role of ground truth from the definition of interpretability. Our work defines the entropy of supervised classification models and provides bounds on the entropy of Piece-Wise Linear Neural Networks (PWLNs), along with the first theoretical bounds on the interpretability of PWLNs. We evaluate the proposed framework on the MNIST, Fashion-MNIST and Stanford40 datasets, demonstrating its applicability in different supervised classification scenarios.
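The paper's exact formulation is in the full text, but the abstract's core idea — scoring a complex model's interpretability by how faithfully a simpler model can be distilled from it, measured information-theoretically against the teacher's outputs rather than against ground-truth labels — can be sketched roughly as below. Everything in this sketch is an illustrative assumption, not the authors' method: the random ReLU network standing in for the PWLN teacher, the logistic-regression student, and the KL-divergence-based score are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical "teacher": a fixed random two-layer ReLU net,
# i.e. a small piece-wise linear network (PWLN).
D, H, K, N = 10, 32, 3, 2000
W1, b1 = rng.normal(size=(D, H)), rng.normal(size=H)
W2, b2 = rng.normal(size=(H, K)), rng.normal(size=K)

def teacher_probs(X):
    h = np.maximum(X @ W1 + b1, 0.0)  # ReLU gives the piece-wise linear structure
    return softmax(h @ W2 + b2)

# "Student": multinomial logistic regression distilled on the
# teacher's soft outputs — no ground-truth labels are used anywhere.
X = rng.normal(size=(N, D))
P_teacher = teacher_probs(X)

V, c, lr = np.zeros((D, K)), np.zeros(K), 0.5
for _ in range(500):
    P_student = softmax(X @ V + c)
    # Gradient of mean cross-entropy CE(teacher, student) w.r.t. the logits.
    G = (P_student - P_teacher) / N
    V -= lr * X.T @ G
    c -= lr * G.sum(axis=0)

# Interpretability-style score: how much of the teacher's predictive
# distribution the simple student recovers (lower KL = more distillable).
P_student = softmax(X @ V + c)
eps = 1e-12
kl = np.mean(np.sum(P_teacher * (np.log(P_teacher + eps)
                                 - np.log(P_student + eps)), axis=1))
teacher_entropy = np.mean(-np.sum(P_teacher * np.log(P_teacher + eps), axis=1))
print(f"avg KL(teacher || student): {kl:.4f}")
print(f"avg teacher entropy:        {teacher_entropy:.4f}")
```

The design point this toy illustrates is the one the abstract emphasizes: the score depends only on the teacher's output distribution and the capacity of the student class, so ground truth plays no role in the resulting notion of interpretability.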

Related research

01/20/2019
Quantifying Interpretability and Trust in Machine Learning Systems
Decisions by Machine Learning (ML) models have become ubiquitous. Trusti...

06/02/2020
Local Interpretability of Calibrated Prediction Models: A Case of Type 2 Diabetes Mellitus Screening Test
Machine Learning (ML) models are often complex and difficult to interpre...

06/12/2022
A Functional Information Perspective on Model Interpretation
Contemporary predictive models are hard to interpret as their deep nets ...

07/08/2019
The Price of Interpretability
When quantitative models are used to support decision-making on complex ...

05/24/2022
Interpretation Quality Score for Measuring the Quality of interpretability methods
Machine learning (ML) models have been applied to a wide range of natura...

04/22/2022
A Unifying Framework for Combining Complementary Strengths of Humans and ML toward Better Predictive Decision-Making
Hybrid human-ML systems are increasingly in charge of consequential deci...

04/14/2022
Interpretability of Machine Learning Methods Applied to Neuroimaging
Deep learning methods have become very popular for the processing of nat...
