Solving the Baby Intuitions Benchmark with a Hierarchically Bayesian Theory of Mind

08/04/2022
by Tan Zhi-Xuan, et al.

To facilitate the development of new models to bridge the gap between machine and human social intelligence, the recently proposed Baby Intuitions Benchmark (arXiv:2102.11938) provides a suite of tasks designed to evaluate commonsense reasoning about agents' goals and actions that even young infants exhibit. Here we present a principled Bayesian solution to this benchmark, based on a hierarchically Bayesian Theory of Mind (HBToM). By including hierarchical priors on agent goals and dispositions, inference over our HBToM model enables few-shot learning of the efficiency and preferences of an agent, which can then be used in commonsense plausibility judgements about subsequent agent behavior. This approach achieves near-perfect accuracy on most benchmark tasks, outperforming deep learning and imitation learning baselines while producing interpretable human-like inferences, demonstrating the advantages of structured Bayesian models of human social cognition.
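The inference pattern described in the abstract can be illustrated with a toy sketch. This is not the authors' HBToM model. It assumes a softmax (Boltzmann-rational) choice rule, a two-object world, and a small grid of rationality values, none of which come from the source. All names (`GOALS`, `BETAS`, `choice_prob`, `posterior`, `predictive`) and all numeric values are hypothetical. The sketch infers a joint posterior over an agent's preferred object and its efficiency from a few familiarization trials, then scores a test trial by its posterior predictive probability, where a lower score means a less plausible behavior.

```python
import math
from itertools import product

# Hypothetical hypothesis space: which object the agent prefers, and how
# efficient (rational) it is. Both grids are illustrative assumptions.
GOALS = ("A", "B")
BETAS = (0.5, 1.0, 2.0, 4.0)


def choice_prob(choice, preferred, beta, costs):
    """Softmax probability of reaching `choice`, given a preferred object,
    a rationality parameter `beta`, and a path cost for each object."""
    utils = {o: (1.0 if o == preferred else 0.0) - costs[o] for o in GOALS}
    z = sum(math.exp(beta * u) for u in utils.values())
    return math.exp(beta * utils[choice]) / z


def posterior(trials):
    """Exact posterior over (preferred, beta) under a uniform prior,
    computed by enumerating every hypothesis."""
    weights = {}
    for g, b in product(GOALS, BETAS):
        w = 1.0 / (len(GOALS) * len(BETAS))  # uniform prior weight
        for choice, costs in trials:
            w *= choice_prob(choice, g, b, costs)
        weights[(g, b)] = w
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def predictive(choice, costs, post):
    """Posterior predictive probability of a test-trial choice."""
    return sum(p * choice_prob(choice, g, b, costs)
               for (g, b), p in post.items())


# Four familiarization trials in which the agent reaches object A
# when both objects are equally costly to reach.
familiarization = [("A", {"A": 0.2, "B": 0.2})] * 4
post = posterior(familiarization)

# Test trial: compare the plausibility of reaching A versus B.
test_costs = {"A": 0.2, "B": 0.2}
p_expected = predictive("A", test_costs, post)
p_unexpected = predictive("B", test_costs, post)
print(f"P(A) = {p_expected:.3f}, P(B) = {p_unexpected:.3f}")
```

Enumerating the hypothesis grid keeps the posterior exact and easy to inspect. A fuller hierarchical model would typically also place priors over the parameters of these distributions rather than fixing them as uniform.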


