The Definitions of Interpretability and Learning of Interpretable Models

05/29/2021
by   Weishen Pan, et al.

As machine learning algorithms are adopted in an ever-increasing number of applications, interpretability has emerged as a crucial desideratum. In this paper, we propose a mathematical definition of a human-interpretable model. In particular, we define interpretability between two information processing systems. If a prediction model is interpretable by a human recognition system under this definition, we call it a completely human-interpretable model. We further design a practical framework for training a completely human-interpretable model through user interactions. Experiments on image datasets show two advantages of the proposed model: 1) it provides an entire decision-making process that is human-understandable; 2) it is more robust against adversarial attacks.


research
07/08/2019

The Price of Interpretability

When quantitative models are used to support decision-making on complex ...
research
09/13/2019

A Double Penalty Model for Interpretability

Modern statistical learning techniques have often emphasized prediction ...
research
07/12/2017

A Formal Framework to Characterize Interpretability of Procedures

We provide a novel notion of what it means to be interpretable, looking ...
research
05/24/2023

SenteCon: Leveraging Lexicons to Learn Human-Interpretable Language Representations

Although deep language representations have become the dominant form of ...
research
06/09/2017

TIP: Typifying the Interpretability of Procedures

We provide a novel notion of what it means to be interpretable, looking ...
research
04/07/2020

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

With the growing popularity of deep-learning based NLP models, comes a n...
research
08/28/2022

IDP-PGFE: An Interpretable Disruption Predictor based on Physics-Guided Feature Extraction

Disruption prediction has made rapid progress in recent years, especiall...
