The Bouncer Problem: Challenges to Remote Explainability

10/03/2019
by Erwan Le Merrer, et al.

The concept of explainability is intended to satisfy society's demands for transparency about machine-learning decisions. The idea is simple: like humans, algorithms should explain the rationale behind their decisions so that their fairness can be assessed. While this approach is promising in a local context (e.g., to explain a model during debugging at training time), we argue that it cannot simply be transposed to a remote context, where a model trained by a service provider is accessible only through its API. This is problematic, as the remote setting is precisely the target use case in which society demands transparency. Through an analogy with a club bouncer (who may give untruthful explanations when rejecting a customer), we show that providing explanations cannot prevent a remote service from lying about the true reasons behind its decisions. More precisely, we prove the impossibility of remote explainability for single explanations, by constructing an attack on explanations that hides discriminatory features from the querying user. We provide an example implementation of this attack. We then show that the probability that an observer spots the attack by collecting several explanations and looking for inconsistencies is low in practical settings. This undermines the very concept of remote explainability in general.
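The bouncer analogy can be made concrete with a small sketch. The following toy example is not the paper's implementation; it only illustrates, under assumed names (`true_decision`, `surrogate_explanation`, `remote_api`) and a made-up loan scenario, how a remote service can decide using a protected feature while returning an explanation that is plausible for each single query, so that only cross-query comparison could reveal the lie:

```python
# Toy "bouncer" service: the true decision rule uses a protected feature,
# but the explanation returned through the API cites only income, with a
# per-query threshold chosen to be consistent with that one decision.

def true_decision(applicant):
    # Discriminatory ground truth: accept only high-income, non-protected
    # applicants. This rule is never exposed to the querying user.
    return applicant["income"] > 50_000 and not applicant["protected"]

def surrogate_explanation(applicant, decision):
    # Fabricate an explanation consistent with the observed decision that
    # mentions only the income feature. For rejections, report a fictitious
    # threshold just above the applicant's income, so the single explanation
    # looks coherent on its own.
    if decision:
        return f"accepted: income {applicant['income']} above threshold 50000"
    fake_threshold = applicant["income"] + 10_000
    return f"rejected: income {applicant['income']} below threshold {fake_threshold}"

def remote_api(applicant):
    # What the remote service exposes: a decision plus a (possibly lying)
    # explanation. The protected feature never appears in the output.
    decision = true_decision(applicant)
    return decision, surrogate_explanation(applicant, decision)

# Two applicants identical except for the protected feature receive
# different decisions, yet each explanation is individually plausible.
print(remote_api({"income": 60_000, "protected": False}))
print(remote_api({"income": 60_000, "protected": True}))
```

Note how spotting the attack requires comparing explanations across queries (the stated income thresholds disagree), which is exactly the low-probability detection scenario the abstract discusses.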


