Interpreting Black Box Predictions using Fisher Kernels

10/23/2018
by Rajiv Khanna, et al.

Research in both machine learning and psychology suggests that salient examples can help humans interpret learning models. To this end, we take a novel look at black box interpretation of test predictions in terms of training examples. Our goal is to ask "which training examples are most responsible for a given set of predictions?" To answer this question, we use Fisher kernels as the defining feature embedding of each data point, combined with Sequential Bayesian Quadrature (SBQ) for efficient selection of examples. In contrast to prior work, our method handles subsets of test predictions of any size in a principled way. We theoretically analyze our approach, providing novel convergence bounds for SBQ over discrete candidate atoms. Our approach recovers the application of influence functions for interpretability as a special case, and this connection yields novel insights. We also present applications of the proposed approach to three use cases: cleaning training data, fixing mislabeled examples, and data summarization.
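The pipeline described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes a logistic regression model whose parameters `theta` are already fitted (a random vector stands in here), uses the empirical Fisher information estimated from training scores, and takes the greedy SBQ objective to be the standard quadrature variance reduction z_S^T K_SS^{-1} z_S, where z is the mean kernel value between each training candidate and the chosen test subset. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def score_vectors(X, y, theta):
    """Per-example gradient of log p(y | x, theta) for logistic regression."""
    return (y - sigmoid(X @ theta))[:, None] * X

def fisher_kernel(G_a, G_b, fisher_info, ridge=1e-6):
    """K[i, j] = G_a[i]^T I^{-1} G_b[j]; the ridge term is an assumption
    added only for numerical stability."""
    d = fisher_info.shape[0]
    inv = np.linalg.inv(fisher_info + ridge * np.eye(d))
    return G_a @ inv @ G_b.T

def greedy_sbq(K_train, z, m, jitter=1e-8):
    """Greedily pick m training indices maximizing z_S^T K_SS^{-1} z_S."""
    selected = []
    for _ in range(m):
        best_j, best_val = None, -np.inf
        for j in range(len(z)):
            if j in selected:
                continue
            S = selected + [j]
            K_SS = K_train[np.ix_(S, S)] + jitter * np.eye(len(S))
            val = z[S] @ np.linalg.solve(K_SS, z[S])
            if val > best_val:
                best_j, best_val = j, val
        selected.append(best_j)
    return selected

# Synthetic stand-in data; theta is assumed to be an already-fitted model.
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
X_train = rng.normal(size=(40, 5))
y_train = (rng.random(40) < sigmoid(X_train @ theta)).astype(float)
X_test = rng.normal(size=(8, 5))
y_test = (rng.random(8) < sigmoid(X_test @ theta)).astype(float)

G_train = score_vectors(X_train, y_train, theta)
G_test = score_vectors(X_test, y_test, theta)
fisher_info = G_train.T @ G_train / len(G_train)  # empirical Fisher information

K_train = fisher_kernel(G_train, G_train, fisher_info)
# z[j]: average kernel similarity between training point j and the test subset.
z = fisher_kernel(G_train, G_test, fisher_info).mean(axis=1)

chosen = greedy_sbq(K_train, z, m=3)
print("training examples selected for this test subset:", chosen)
```

Because `z` averages over whichever test points are passed in, the same code handles a single test prediction or a larger subset. Note that this Fisher kernel has rank at most the number of parameters, so selecting more examples than parameters relies on the jitter term to keep `K_SS` invertible.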

