Distributing Synergy Functions: Unifying Game-Theoretic Interaction Methods for Machine-Learning Explainability

05/04/2023
by Daniel Lundstrom, et al.

Deep learning has revolutionized many areas of machine learning, from computer vision to natural language processing, but these high-performance models are generally "black boxes." Explaining such models would improve transparency and trust in AI-powered decision making and is necessary for addressing other practical needs such as robustness and fairness. A popular means of enhancing model transparency is to quantify how individual inputs contribute to model outputs (called attributions) and the magnitude of interactions between groups of inputs. A growing number of these methods import concepts and results from game theory to produce attributions and interactions. This work presents a unifying framework for game-theory-inspired attribution and k-th-order interaction methods. We show that, given modest assumptions, a unique full account of interactions between features, called synergies, is possible in the continuous input setting. We identify how various methods are characterized by their policy of distributing synergies. We also demonstrate that gradient-based methods are characterized by their actions on monomials, a type of synergy function, and introduce unique gradient-based methods. We show that various combinations of criteria uniquely determine an attribution or interaction method. The community therefore needs to identify its goals and context of use when developing and employing attribution and interaction methods.
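To make the framing concrete, here is a minimal sketch (a toy cooperative game with hypothetical helper names, not the paper's code) of the idea the abstract describes: a set function decomposes uniquely into synergies, its Harsanyi dividends, and an attribution method amounts to a policy for distributing them. The Shapley value, for instance, splits each synergy equally among the features involved.

```python
from itertools import combinations

def subsets(s):
    """All subsets of s as frozensets, the empty set included."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def dividends(v, players):
    """Harsanyi dividends (synergies) of a set function v via Moebius inversion:
    w(S) = sum over T subset of S of (-1)^(|S| - |T|) * v(T)."""
    return {S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
            for S in subsets(players)}

def shapley(v, players):
    """Shapley values as a synergy-distribution policy: each synergy w(S)
    is split equally among the |S| features in S."""
    w = dividends(v, players)
    return {i: sum(w[S] / len(S) for S in w if i in S) for i in players}

# Toy game: features 0 and 1 only pay off jointly; feature 2 acts alone.
def v(S):
    return float(0 in S and 1 in S) + 0.5 * (2 in S)

players = {0, 1, 2}
print({tuple(sorted(S)): w for S, w in dividends(v, players).items() if w})
# {(0, 1): 1.0, (2,): 0.5}  -- all other synergies vanish
print(shapley(v, players))
# {0: 0.5, 1: 0.5, 2: 0.5}
```

Summing each feature's share recovers v over the grand coalition, which is the efficiency property that the distribution-policy view makes explicit.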

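The claim that gradient-based methods are characterized by their action on monomials can likewise be sanity-checked. For Integrated Gradients with a zero baseline and a straight-line path, a degree-k monomial in distinct variables has its value split equally among the k variables. The sketch below (illustrative, not from the paper) verifies this numerically for f(x) = x0 * x1 * x2 using an analytic gradient and a midpoint Riemann sum.

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline=None, steps=2048):
    """Straight-line Integrated Gradients from `baseline` to `x`,
    approximating the path integral with a midpoint Riemann sum."""
    x = np.asarray(x, dtype=float)
    b = np.zeros_like(x) if baseline is None else np.asarray(baseline, dtype=float)
    alphas = (np.arange(steps) + 0.5) / steps          # midpoints of [0, 1]
    grads = np.stack([f_grad(b + a * (x - b)) for a in alphas])
    return (x - b) * grads.mean(axis=0)

# Degree-3 monomial f(x) = x0 * x1 * x2 and its analytic gradient.
grad_f = lambda x: np.array([x[1] * x[2], x[0] * x[2], x[0] * x[1]])

x = np.array([2.0, 3.0, 0.5])                          # f(x) = 3.0
print(integrated_gradients(grad_f, x))
# ~[1.0, 1.0, 1.0]: the monomial's value, 3.0, split equally among its 3 variables
```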

Related research

11/27/2022 · Attribution-based XAI Methods in Computer Vision: A Review
03/22/2021 · Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles
07/23/2021 · Robust Explainability: A Tutorial on Gradient-Based Attribution Methods for Deep Neural Networks
02/23/2022 · Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four
02/08/2022 · Time to Focus: A Comprehensive Benchmark Using Time Series Attribution Methods
07/18/2023 · Gradient strikes back: How filtering out high frequencies improves explanations
07/15/2022 · Anomalous behaviour in loss-gradient based interpretability methods
