DeepAI AI Chat
Log In Sign Up

Explainability Via Causal Self-Talk

by   Nicholas A. Roy, et al.

Explaining the behavior of AI systems is an important problem that, in practice, is generally avoided. While the XAI community has been developing an abundance of techniques, most incur a set of costs that the wider deep learning community has been unwilling to pay in most situations. We take a pragmatic view of the issue, and define a set of desiderata that capture both the ambitions of XAI and the practical constraints of deep learning. We describe an effective way to satisfy all the desiderata: train the AI system to build a causal model of itself. We develop an instance of this solution for Deep RL agents: Causal Self-Talk. CST operates by training the agent to communicate with itself across time. We implement this method in a simulated 3D environment, and show how it enables agents to generate faithful and semantically-meaningful explanations of their own behavior. Beyond explanations, we also demonstrate that these learned models provide new ways of building semantic control interfaces to AI systems.


page 7

page 8

page 9


Tell me why! – Explanations support learning of relational and causal structure

Explanations play a considerable role in human learning, especially in a...

Causal Analysis of Agent Behavior for AI Safety

As machine learning systems become more powerful they also become increa...

Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability

Explaining AI systems is fundamental both to the development of high per...

Experiential Explanations for Reinforcement Learning

Reinforcement Learning (RL) approaches are becoming increasingly popular...

CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships

As machine learning models become increasingly prevalent in motion forec...

Brittle AI, Causal Confusion, and Bad Mental Models: Challenges and Successes in the XAI Program

The advances in artificial intelligence enabled by deep learning archite...