A Formal Solution to the Grain of Truth Problem

09/16/2016
by Jan Leike, et al.

A Bayesian agent acting in a multi-agent environment learns to predict the other agents' policies if its prior assigns positive probability to them (in other words, its prior contains a grain of truth). Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the grain of truth problem. Only small classes are known to have a grain of truth, and the literature contains several related impossibility results. In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of policies that contains all computable policies as well as Bayes-optimal policies for every lower semicomputable prior over the class. When the environment is unknown, Bayes-optimal agents may fail to act optimally even asymptotically. However, agents based on Thompson sampling converge to play ϵ-Nash equilibria in arbitrary unknown computable multi-agent environments. While these results are purely theoretical, we show that they can be computationally approximated arbitrarily closely.
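The role of the grain of truth in the first sentence can be illustrated with a toy sketch. This is not the paper's construction: the four-policy class, the names in `POLICIES`, and the helper `posterior_after` are all hypothetical, chosen only to show that a Bayesian posterior over a finite class concentrates on the true opponent policy when the prior gives it positive weight, and cannot do so when the prior gives it zero weight.

```python
import random

# Hypothetical finite class of opponent policies for a repeated two-action game.
# Each policy maps the history of past opponent actions to P(next action == 1).
POLICIES = {
    "mostly_0": lambda history: 0.05,
    "mostly_1": lambda history: 0.95,
    "uniform": lambda history: 0.5,
    "alternate": lambda history: 0.9 if len(history) % 2 == 0 else 0.1,
}


def posterior_after(true_name, prior, steps, seed=0):
    """Observe `steps` actions sampled from POLICIES[true_name] and return
    the Bayesian posterior over policy names, starting from `prior`."""
    rng = random.Random(seed)
    weights = dict(prior)
    history = []
    for _ in range(steps):
        # The true opponent acts.
        p1 = POLICIES[true_name](history)
        action = 1 if rng.random() < p1 else 0
        # Bayes update: multiply each weight by the likelihood of the action.
        for name in weights:
            q1 = POLICIES[name](history)
            weights[name] *= q1 if action == 1 else 1.0 - q1
        # Renormalise every step to avoid numerical underflow.
        total = sum(weights.values())
        weights = {name: w / total for name, w in weights.items()}
        history.append(action)
    return weights


if __name__ == "__main__":
    # Prior with a grain of truth: every policy, including the true one, has weight > 0.
    with_grain = {name: 0.25 for name in POLICIES}
    # Prior without a grain of truth: the true policy "alternate" has weight 0.
    without_grain = {"mostly_0": 1 / 3, "mostly_1": 1 / 3, "uniform": 1 / 3, "alternate": 0.0}

    print(posterior_after("alternate", with_grain, steps=200))
    print(posterior_after("alternate", without_grain, steps=200))
```

In the second call the weight on `alternate` stays at zero no matter how much data arrives, because a Bayes update can only rescale prior mass, never create it. The difficulty the abstract describes is that in a multi-agent setting the class must also contain the Bayes-optimal responses to itself, which this toy class does not attempt to capture.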

research
11/28/2016

Nonparametric General Reinforcement Learning

Reinforcement learning (RL) problems are often phrased in terms of Marko...
research
03/04/2019

Strong Asymptotic Optimality in General Environments

Reinforcement Learning agents are expected to eventually perform well. T...
research
06/10/2021

ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations

Multi-agent simulations provide a scalable environment for learning poli...
research
09/09/2011

A Framework for Sequential Planning in Multi-Agent Settings

This paper extends the framework of partially observable Markov decision...
research
03/13/2018

Decentralised Learning in Systems with Many, Many Strategic Agents

Although multi-agent reinforcement learning can tackle systems of strate...
research
04/07/2013

A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior

Recent advances in Bayesian reinforcement learning (BRL) have shown that...
research
05/31/2022

Simplex NeuPL: Any-Mixture Bayes-Optimality in Symmetric Zero-sum Games

Learning to play optimally against any mixture over a diverse set of str...
