CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments

04/13/2023
by Ricardo Cannizzaro, et al.

Robots operating in real-world environments must reason about the possible outcomes of stochastic actions and make decisions based on partial observations of the true world state. A major challenge for making accurate and robust action predictions is confounding, which, if left untreated, can lead to prediction errors. The partially observable Markov decision process (POMDP) is a widely used framework for modelling these stochastic and partially observable decision-making problems. However, because they lack explicit causal semantics, POMDP planning methods are prone to confounding bias and may therefore produce underperforming policies in the presence of unobserved confounders. This paper presents a novel causally-informed extension of the "anytime regularized determinized sparse partially observable tree" (AR-DESPOT), a modern anytime online POMDP planner, that uses causal modelling and inference to eliminate errors caused by unmeasured confounder variables. We further propose a method to learn offline, from ground-truth model data, the partial parameterisation of the causal model used for planning. We evaluate our methods on a toy problem with an unobserved confounder and show that the learned causal model is highly accurate, while our planning method is more robust to confounding and produces overall higher-performing policies than AR-DESPOT.
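
To make the notion of confounding bias concrete, the following is a minimal sketch and not the paper's method or toy problem. It assumes a hypothetical model with one binary hidden confounder U that influences both which action A appears in logged data and whether the outcome Y succeeds. All probabilities and names (`P_U`, `P_A1_GIVEN_U`, `P_Y1_GIVEN_AU`, `observational`, `interventional`) are illustrative assumptions. The sketch compares the naive conditional estimate P(Y=1 | A=a) with the interventional estimate P(Y=1 | do(A=a)) obtained by back-door adjustment over U, which presumes the model over U is specified even though U is never observed directly.

```python
# Illustrative toy model; every number here is an assumption, not from the paper.
P_U = {0: 0.5, 1: 0.5}            # prior over the hidden confounder U
P_A1_GIVEN_U = {0: 0.1, 1: 0.9}   # logged behaviour: P(A=1 | U)
P_Y1_GIVEN_AU = {                 # outcome model: P(Y=1 | A, U), keyed by (a, u)
    (0, 0): 0.4, (0, 1): 0.8,
    (1, 0): 0.2, (1, 1): 0.6,
}


def p_a_given_u(a, u):
    """P(A=a | U=u) for a binary action."""
    p1 = P_A1_GIVEN_U[u]
    return p1 if a == 1 else 1.0 - p1


def observational(a):
    """P(Y=1 | A=a): seeing action a in the logs also shifts the belief over U."""
    joint = {u: P_U[u] * p_a_given_u(a, u) for u in P_U}
    norm = sum(joint.values())
    return sum(P_Y1_GIVEN_AU[(a, u)] * joint[u] / norm for u in P_U)


def interventional(a):
    """P(Y=1 | do(A=a)): back-door adjustment, averaging over the prior on U."""
    return sum(P_Y1_GIVEN_AU[(a, u)] * P_U[u] for u in P_U)


if __name__ == "__main__":
    for a in (0, 1):
        print(f"a={a}  P(Y=1|A=a)={observational(a):.2f}  "
              f"P(Y=1|do(A=a))={interventional(a):.2f}")
```

Under these assumed numbers the conditional estimate ranks action 1 above action 0 (about 0.56 versus 0.44), while the interventional estimate reverses the ranking (about 0.40 versus 0.60). A planner that predicts action outcomes from the conditional quantity would therefore prefer the worse action, which is the kind of error the abstract attributes to unobserved confounders.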


research
12/17/2021

Visual Learning-based Planning for Continuous High-Dimensional POMDPs

The Partially Observable Markov Decision Process (POMDP) is a powerful f...
research
09/09/2019

Off-Policy Evaluation in Partially Observable Environments

This work studies the problem of batch off-policy evaluation for Reinfor...
research
07/24/2017

Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions

This paper presents a data-driven approach for multi-robot coordination ...
research
06/22/2021

Algorithmic Recourse in Partially and Fully Confounded Settings Through Bounding Counterfactual Effects

Algorithmic recourse aims to provide actionable recommendations to indiv...
research
09/08/2023

Offline Recommender System Evaluation under Unobserved Confounding

Off-Policy Estimation (OPE) methods allow us to learn and evaluate decis...
research
06/01/2023

Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

A prominent challenge of offline reinforcement learning (RL) is the issu...
research
12/23/2020

Identification of Unexpected Decisions in Partially Observable Monte-Carlo Planning: a Rule-Based Approach

Partially Observable Monte-Carlo Planning (POMCP) is a powerful online a...
