Interpretable machine learning: definitions, methods, and applications

01/14/2019
by W. James Murdoch, et al.

Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.
