Understanding Post-hoc Explainers: The Case of Anchors

03/15/2023
by Gianluigi Lopardo, et al.

In many scenarios, the interpretability of machine learning models is highly desirable but difficult to achieve. Local model-agnostic approaches have been proposed to explain the individual predictions of such models. However, the process generating the explanations can be, for a user, as mysterious as the prediction to be explained. Furthermore, interpretability methods frequently lack theoretical guarantees, and their behavior even on simple models is often unknown. While it is difficult, if not impossible, to ensure that an explainer behaves as expected on a cutting-edge model, we can at least verify that everything works on simple, already interpretable models. In this paper, we present a theoretical analysis of Anchors (Ribeiro et al., 2018): a popular rule-based interpretability method that highlights a small set of words to explain a text classifier's decision. After formalizing its algorithm and providing useful insights, we demonstrate mathematically that Anchors produces meaningful results when used with linear text classifiers on top of a TF-IDF vectorization. We believe our analysis framework can aid the development of new explainability methods built on solid theoretical foundations.


