Causally Correct Partial Models for Reinforcement Learning

02/07/2020 ∙ by Danilo J. Rezende, et al. ∙ 17

In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don't model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, yet remain fast because they do not need to fully model future observations.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 8

page 26

page 27

page 28

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.