
Causally Correct Partial Models for Reinforcement Learning

by Danilo J. Rezende, et al.

In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don't model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, yet remain fast because they do not need to fully model future observations.
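The confounding described above can be illustrated with a small worked example. This is a toy sketch under stated assumptions, not the paper's own environment or method: the binary variable `z`, the 0.9 matching probability, and the reward rule are all hypothetical choices made for illustration. A behavior policy sees an observation `z` and usually picks the action that matches it. A partial model that predicts reward from the action alone, without modeling `z`, learns the conditional E[r | a] from logged data, while planning needs the interventional quantity E[r | do(a)].

```python
# Toy setup (all values are illustrative assumptions, not from the paper).
P_Z = {0: 0.5, 1: 0.5}   # distribution of the observation z the partial model ignores
MATCH = 0.9              # behavior policy picks a == z with this probability


def policy(a, z):
    """Probability that the behavior policy picks action a after seeing z."""
    return MATCH if a == z else 1.0 - MATCH


def reward(a, z):
    """Reward is 1 when the action matches z, else 0."""
    return 1.0 if a == z else 0.0


def conditional_reward(a):
    """E[r | a] under the behavior policy: what a partial model
    fit to logged (a, r) pairs would converge to."""
    num = sum(P_Z[z] * policy(a, z) * reward(a, z) for z in P_Z)
    den = sum(P_Z[z] * policy(a, z) for z in P_Z)
    return num / den


def interventional_reward(a):
    """E[r | do(a)]: the action is forced without looking at z,
    so z keeps its prior distribution."""
    return sum(P_Z[z] * reward(a, z) for z in P_Z)


if __name__ == "__main__":
    for a in (0, 1):
        print(a, conditional_reward(a), interventional_reward(a))
```

In this toy setting the conditional estimate is about 0.9 for either action, while forcing an action without looking at `z` yields an expected reward of 0.5. A planner that trusts the conditional estimate would therefore overestimate the value of committing to an action blindly, which is the kind of incorrect planning the abstract refers to.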




Learning Partially Observable Deterministic Action Models

We present exact algorithms for identifying deterministic-actions effect...

Mapping Instructions and Visual Observations to Actions with Reinforcement Learning

We propose to directly map raw visual observations and text input to act...

Partial-Order, Partially-Seen Observations of Fluents or Actions for Plan Recognition as Planning

This work aims to make plan recognition as planning more ready for real-...

PAC Reinforcement Learning with Rich Observations

We propose and study a new model for reinforcement learning with rich ob...

Control of Memory, Active Perception, and Action in Minecraft

In this paper, we introduce a new set of reinforcement learning (RL) tas...

Recurrent Environment Simulators

Models that can simulate how environments change in response to actions ...

Influence-aware Memory for Deep Reinforcement Learning

Making the right decisions when some of the state variables are hidden, ...