Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four

02/23/2022
by Stephan Wäldchen, et al.

One goal of Explainable AI (XAI) is to determine which input components were relevant for a classifier's decision. This is commonly known as saliency attribution. Characteristic functions (from cooperative game theory) can evaluate partial inputs and form the basis for theoretically "fair" attribution methods such as Shapley values. Given only a standard classifier function, however, it is unclear how partial input should be realised. Instead, most XAI-methods for black-box classifiers such as neural networks consider counterfactual inputs that generally lie off-manifold, which makes these methods hard to evaluate and easy to manipulate. We propose a setup to directly train characteristic functions in the form of neural networks to play simple two-player games. We apply this to the game of Connect Four by randomly hiding colour information from our agents during training. This has three advantages for comparing XAI-methods: it alleviates the ambiguity about how to realise partial input, makes off-manifold evaluation unnecessary, and allows us to compare the methods by letting them play against each other.
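To make the link between characteristic functions and Shapley values concrete, here is a minimal sketch of the standard exact Shapley formula applied to a generic characteristic function. It is not the authors' implementation. The names `shapley_values` and `toy_v` are hypothetical, and `toy_v` is a hand-written stand-in for a trained network that scores a partially revealed input.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Exact Shapley values for features 0..n-1.

    `v` is a characteristic function: it maps a frozenset of revealed
    feature indices to a real-valued score. Cost is exponential in n,
    so this is only practical for very small n.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition S that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of revealing feature i on top of S.
                phi[i] += weight * (v(s | {i}) - v(s))
    return phi


def toy_v(revealed):
    # Hypothetical characteristic function (an assumption for illustration):
    # the score is 1 only when features 0 and 1 are both revealed;
    # feature 2 never matters.
    return 1.0 if {0, 1} <= revealed else 0.0


values = shapley_values(3, toy_v)
print(values)
```

In this toy setting the attribution is split evenly between features 0 and 1, and feature 2 receives none. In the setup described above, the role of `toy_v` would be played by a network trained directly on boards with hidden colour information, so no off-manifold input has to be constructed to evaluate a partial input.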


