BIM: Towards Quantitative Evaluation of Interpretability Methods with Ground Truth

07/23/2019
by Mengjiao Yang, et al.

Interpretability is rising as an important area of research in machine learning for safer deployment of machine learning systems. Despite active development, quantitative evaluation of interpretability methods remains a challenge due to the lack of ground truth; we do not know which features or concepts are important to a classification model. In this work, we propose the Benchmark Interpretability Methods (BIM) framework, which offers a set of tools to quantitatively compare a model's ground truth to the output of interpretability methods. Our contributions are: 1) a carefully crafted dataset and models trained with known ground truth and 2) three complementary metrics to evaluate interpretability methods. Our metrics focus on identifying false positives: features that are incorrectly attributed as important. These metrics compare how methods perform across models, across images, and per image. We open-source the dataset, models, and metrics, evaluated on many widely used interpretability methods.
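As a rough illustration of the kind of false-positive evaluation the abstract describes, the sketch below scores how much attribution mass a saliency method places on pixels known to be irrelevant to the model. This is not BIM's actual metric or code; the function names, inputs (`attribution`, `ground_truth_mask`), and the specific ratio are assumptions made for illustration only.

```python
import numpy as np

def false_positive_ratio(attribution: np.ndarray,
                         ground_truth_mask: np.ndarray) -> float:
    """Fraction of total attribution mass assigned to known-irrelevant pixels.

    `attribution` is a per-pixel importance map from any interpretability
    method; `ground_truth_mask` is 1 where the model is known to rely on the
    input and 0 on known-irrelevant background. Lower is better.
    """
    attr = np.abs(attribution)
    total = attr.sum()
    if total == 0:
        return 0.0
    outside = attr[ground_truth_mask == 0].sum()
    return float(outside / total)

def compare_methods(attributions_by_method: dict, masks: list) -> dict:
    """Average the per-image score for each method, giving a cross-method,
    across-images comparison in the spirit of the abstract's metrics."""
    return {
        name: float(np.mean([false_positive_ratio(a, m)
                             for a, m in zip(maps, masks)]))
        for name, maps in attributions_by_method.items()
    }
```

In this toy setup, a method that concentrates importance on the known-irrelevant region would score near 1, while one that ignores it would score near 0.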


