Model Interpretability through the Lens of Computational Complexity

10/23/2020
by Pablo Barceló, et al.

In spite of several claims stating that some models are more interpretable than others – e.g., "linear models are more interpretable than deep neural networks" – we still lack a principled notion of interpretability that allows us to formally compare different classes of models. We take a step towards such a notion by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on local post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class 𝒞_1 of models is more interpretable than another class 𝒞_2 if the computational complexity of answering post-hoc queries for models in 𝒞_2 is higher than for those in 𝒞_1. We prove that this notion provides a good theoretical counterpart to current beliefs on the interpretability of models; in particular, we show that under our definition and under standard complexity-theoretic assumptions (such as P≠NP), both linear and tree-based models are strictly more interpretable than neural networks. Our complexity analysis, however, does not provide a clear-cut difference between linear and tree-based models, as we obtain different results depending on the particular post-hoc explanations considered. Finally, by applying a finer complexity analysis based on parameterized complexity, we are able to prove a theoretical result suggesting that shallow neural networks are more interpretable than deeper ones.
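To make the idea of a local post-hoc query concrete, the following is a minimal sketch of one such query for a toy Boolean linear classifier: it asks whether fixing a subset of an input's features is enough to force the model's decision. The abstract does not name a specific query, so the choice of this one, the helper names `linear_model` and `is_sufficient_reason`, and the weights are all assumptions made for illustration. The check is a naive brute-force enumeration, not an algorithm from the paper; it only shows why the cost of answering such a question is a natural measure to compare across model classes.

```python
from itertools import product


def linear_model(weights, bias):
    """Hypothetical Boolean linear classifier: outputs 1 iff w.x + b >= 0."""
    def classify(x):
        return int(sum(w * xi for w, xi in zip(weights, x)) + bias >= 0)
    return classify


def is_sufficient_reason(model, instance, fixed):
    """Return True if fixing the features in `fixed` to their values in
    `instance` forces the model's output, whatever the other features are.

    Brute force: enumerates all 2^(n - |fixed|) completions of the free
    features, so the running time is exponential in their number.
    """
    target = model(instance)
    free = [i for i in range(len(instance)) if i not in fixed]
    for values in product((0, 1), repeat=len(free)):
        x = list(instance)
        for i, v in zip(free, values):
            x[i] = v
        if model(x) != target:
            return False  # some completion flips the decision
    return True


# Illustrative (made-up) model and input over three Boolean features.
model = linear_model(weights=[3, 1, 1], bias=-3)
x = [1, 0, 1]

explains_with_feature_0 = is_sufficient_reason(model, x, {0})
explains_with_feature_2 = is_sufficient_reason(model, x, {2})
```

Here fixing feature 0 alone forces the positive decision, while fixing feature 2 alone does not. The enumeration works for any black-box classifier but takes exponential time; the question studied in the abstract is, roughly, how much better than this one can do for each class of models.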


