Best-Response Bayesian Reinforcement Learning with Bayes-adaptive POMDPs for Centaurs

04/03/2022
by Mustafa Mert Çelikok, et al.

Centaurs are half-human, half-AI decision-makers where the AI's goal is to complement the human. To do so, the AI must be able to recognize the goals and constraints of the human and have the means to help them. We present a novel formulation of the interaction between the human and the AI as a sequential game where the agents are modelled using Bayesian best-response models. We show that in this case the AI's problem of helping bounded-rational humans make better decisions reduces to a Bayes-adaptive POMDP. In our simulated experiments, we consider an instantiation of our framework for humans who are subjectively optimistic about the AI's future behaviour. Our results show that when equipped with a model of the human, the AI can infer the human's bounds and nudge them towards better decisions. We also discuss ways in which the machine can learn to overcome its own limitations with the help of the human. We identify a novel trade-off for centaurs in partially observable tasks: for the AI's actions to be acceptable to the human, the machine must ensure that its beliefs and the human's are sufficiently aligned, but aligning beliefs might be costly. We present a preliminary theoretical analysis of this trade-off and its dependence on task structure.
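To make the idea of "inferring the human's bounds" concrete, here is a minimal sketch of a Bayesian update over an unknown human parameter. It is not the model from the paper: it assumes, purely for illustration, that the human picks actions with a softmax (Boltzmann) rule whose rationality parameter `beta` is unknown to the AI. The candidate `beta` values, the `q_values`, and the observed actions are all hypothetical.

```python
import math


def boltzmann_policy(q_values, beta):
    """Action probabilities under a softmax choice rule (an assumed human model)."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(beta * q for q in q_values)
    weights = [math.exp(beta * q - m) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def update_belief(belief, q_values, observed_action):
    """One Bayes step: posterior over candidate beta values after one observed action."""
    unnormalised = {
        beta: prior * boltzmann_policy(q_values, beta)[observed_action]
        for beta, prior in belief.items()
    }
    z = sum(unnormalised.values())
    return {beta: p / z for beta, p in unnormalised.items()}


# Hypothetical setup: three candidate rationality levels with a uniform prior.
belief = {0.1: 1 / 3, 1.0: 1 / 3, 10.0: 1 / 3}
# Hypothetical action values: action 0 is better than action 1.
q_values = [1.0, 0.0]

# The human mostly picks the better action but slips once.
for action in [0, 0, 1, 0, 0, 0]:
    belief = update_belief(belief, q_values, action)

most_likely_beta = max(belief, key=belief.get)
print(most_likely_beta, belief)
```

In a Bayes-adaptive POMDP, a belief of this kind over the unknown model parameters becomes part of the state the AI plans over, so the AI can weigh actions that reveal more about the human against actions that help immediately.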

research
04/06/2022

A Cognitive Framework for Delegation Between Error-Prone AI and Human Agents

With humans interacting with AI-based systems at an increasing rate, it ...
research
02/28/2022

The dangers in algorithms learning humans' values and irrationalities

For an artificial intelligence (AI) to be aligned with human values (or ...
research
09/27/2019

Beating humans in a penny-matching game by leveraging cognitive hierarchy theory and Bayesian learning

It is a long-standing goal of artificial intelligence (AI) to be superio...
research
07/14/2021

Do Humans Trust Advice More if it Comes from AI? An Analysis of Human-AI Interactions

In many applications of AI, the algorithm's output is framed as a sugges...
research
10/11/2022

Human-AI Coordination via Human-Regularized Search and Learning

We consider the problem of making AI agents that collaborate well with h...
research
04/14/2022

Should I Follow AI-based Advice? Measuring Appropriate Reliance in Human-AI Decision-Making

Many important decisions in daily life are made with the help of advisor...
research
09/14/2017

Towards Cognitive-and-Immersive Systems: Experiments in a Shared (or common) Blockworld Framework

As computational power has continued to increase, and sensors have becom...
