Deep Interpretable Models of Theory of Mind For Human-Agent Teaming

04/07/2021
by   Ini Oguntola, et al.

When developing AI systems that interact with humans, it is essential to design both a system that can understand humans and a system that humans can understand. Most deep-network-based agent-modeling approaches are 1) not interpretable and 2) model only external behavior, ignoring internal mental states, which potentially limits their capability for assistance, interventions, discovering false beliefs, etc. To this end, we develop an interpretable modular neural framework for modeling the intentions of other observed entities. We demonstrate the efficacy of our approach with experiments on data from human participants on a search and rescue task in Minecraft, and show that incorporating interpretability can significantly increase predictive performance under the right conditions.

02/21/2018

Machine Theory of Mind

Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans'...
09/28/2022

Mathematical Models of Theory of Mind

Socially assistive robots provide physical and mental assistance for hum...
09/14/2017

Towards Cognitive-and-Immersive Systems: Experiments in a Shared (or common) Blockworld Framework

As computational power has continued to increase, and sensors have becom...
09/04/2022

Do Large Language Models know what humans know?

Humans can attribute mental states to others, a capacity known as Theory...
04/03/2017

It Takes Two to Tango: Towards Theory of AI's Mind

Theory of Mind is the ability to attribute mental states (beliefs, inten...
06/26/2018

Theory of Machine Networks: A Case Study

We propose a simplification of the Theory-of-Mind Network architecture, ...
04/21/2021

A Unifying Bayesian Formulation of Measures of Interpretability in Human-AI

Existing approaches for generating human-aware agent behaviors have cons...