Counterfactual equivalence for POMDPs, and underlying deterministic environments

01/11/2018
by   Stuart Armstrong, et al.

Partially Observable Markov Decision Processes (POMDPs) are rich environments often used in machine learning, but their information and causal structures have received relatively little study. This paper presents the concepts of equivalent and counterfactually equivalent POMDPs, where agents cannot distinguish which environment they are in through any observations and actions. It shows that any POMDP is counterfactually equivalent, for any finite number of turns, to a deterministic POMDP with all uncertainty concentrated into the initial state. This allows a better understanding of POMDP uncertainty, information, and learning.
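
The idea of concentrating all uncertainty into the initial state can be pictured with a small sketch. It is an illustration under stated assumptions, not the paper's construction: the two-state POMDP, its probabilities, and the names `draw_initial_state` and `rollout` are all invented here. Every random outcome the environment could need over a finite horizon is sampled once, up front, and stored in the initial state; after that, the dynamics are pure table lookups.

```python
import random

# Hypothetical two-state POMDP, invented purely for illustration.
STATES = ["s0", "s1"]
ACTIONS = ["stay", "flip"]

# TRANSITION[state][action] -> {next_state: probability}
TRANSITION = {
    "s0": {"stay": {"s0": 0.9, "s1": 0.1}, "flip": {"s0": 0.2, "s1": 0.8}},
    "s1": {"stay": {"s1": 0.9, "s0": 0.1}, "flip": {"s1": 0.2, "s0": 0.8}},
}

# OBSERVATION[state] -> {observation: probability}
OBSERVATION = {
    "s0": {"low": 0.7, "high": 0.3},
    "s1": {"low": 0.3, "high": 0.7},
}


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def draw_initial_state(horizon, rng, start="s0"):
    """Sample, once, every random outcome needed for `horizon` turns.

    The returned dictionary plays the role of the (random) initial state
    of a deterministic environment.
    """
    next_state = [
        {(s, a): sample(TRANSITION[s][a], rng) for s in STATES for a in ACTIONS}
        for _ in range(horizon)
    ]
    observation = [
        {s: sample(OBSERVATION[s], rng) for s in STATES}
        for _ in range(horizon)
    ]
    return {"state": start, "next_state": next_state, "observation": observation}


def rollout(initial, actions):
    """Deterministic dynamics: table lookups only, no randomness."""
    state = initial["state"]
    observations = []
    for t, action in enumerate(actions):
        state = initial["next_state"][t][(state, action)]
        observations.append(initial["observation"][t][state])
    return observations


rng = random.Random(0)
initial = draw_initial_state(horizon=3, rng=rng)
print(rollout(initial, ["stay", "flip", "stay"]))
# A counterfactual question: what would have been observed under
# different actions, starting from the very same initial state?
print(rollout(initial, ["flip", "flip", "stay"]))
```

Because `rollout` contains no random calls, replaying any action sequence from the same `initial` always yields the same observations, which is what makes counterfactual comparisons between action sequences well defined in this toy setting. The sketch only covers a fixed finite horizon, matching the "any finite number of turns" qualifier in the abstract.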
