Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

10/29/2019
by C. Daniel Freeman, et al.

Much of model-based reinforcement learning involves learning a model of an agent's world, and training an agent to leverage this model to perform a task more efficiently. While these models are demonstrably useful for agents, every naturally occurring model of the world of which we are aware (e.g., a brain) arose as the byproduct of competing evolutionary pressures for survival, not minimization of a supervised forward-predictive loss via gradient descent. That useful models can arise out of the messy and slow optimization process of evolution suggests that forward-predictive modeling can arise as a side effect of optimization under the right circumstances. Crucially, this optimization process need not explicitly minimize a forward-predictive loss. In this work, we introduce a modification to traditional reinforcement learning which we call observational dropout, whereby we limit the agent's ability to observe the real environment at each timestep. In doing so, we can coerce an agent into learning a world model to fill in the observation gaps during reinforcement learning. We show that the world model that emerges, while not explicitly trained to predict the future, can help the agent learn key skills required to perform well in its environment. Videos of our results are available at https://learningtopredict.github.io/
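To make the observational dropout idea concrete, the following is a minimal sketch of one rollout, not the authors' implementation. It assumes that at each timestep the agent sees the real observation with some probability (here called `peek_prob`, a hypothetical name) and otherwise sees the output of its own world model applied to what it last saw. The toy environment, the policy, and every function name are illustrative assumptions.

```python
import random
from typing import Callable, List, Tuple

State = float
Action = float


def rollout_with_observational_dropout(
    env_step: Callable[[State, Action], State],
    world_model: Callable[[State, Action], State],
    policy: Callable[[State], Action],
    initial_state: State,
    peek_prob: float,
    num_steps: int,
    rng: random.Random,
) -> List[Tuple[State, State, bool]]:
    """Run one episode; return (real_state, seen_state, peeked) per step.

    All names here are hypothetical. `peek_prob` is the assumed probability
    that the agent is allowed to observe the real environment at a timestep.
    """
    real = initial_state
    seen = initial_state  # assumption: the first observation is always real
    history = []
    for _ in range(num_steps):
        # The policy only ever acts on what the agent currently "sees".
        action = policy(seen)
        # The real environment always advances, observed or not.
        real = env_step(real, action)
        peeked = rng.random() < peek_prob
        # On a dropped observation, the world model fills in the gap.
        seen = real if peeked else world_model(seen, action)
        history.append((real, seen, peeked))
    return history


# Toy 1-D setup (illustrative only): the state moves by the action.
def toy_env_step(state: State, action: Action) -> State:
    return state + action


def perfect_model(state: State, action: Action) -> State:
    return state + action  # matches the toy dynamics exactly


def frozen_model(state: State, action: Action) -> State:
    return state  # a poor model that ignores the action


def toy_policy(state: State) -> Action:
    return -0.5 * state  # push the perceived state toward zero
```

In this sketch there is no forward-predictive loss anywhere: the world model would be optimized only through the task reward earned by the policy that consumes its outputs. With `frozen_model` and a low `peek_prob`, the seen state drifts away from the real one, which is the pressure that would favor a more useful model under such a setup.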


research
04/04/2019

Self-Adapting Goals Allow Transfer of Predictive Models to New Tasks

A long-standing challenge in Reinforcement Learning is enabling agents t...
research
03/27/2018

World Models

We explore building generative neural network models of popular reinforc...
research
10/15/2022

PI-QT-Opt: Predictive Information Improves Multi-Task Robotic Reinforcement Learning at Scale

The predictive information, the mutual information between the past and ...
research
06/29/2020

Exploring Optimal Control With Observations at a Cost

There has been a current trend in reinforcement learning for healthcare ...
research
06/25/2021

Predictive Control Using Learned State Space Models via Rolling Horizon Evolution

A large part of the interest in model-based reinforcement learning deriv...
research
09/03/2018

Flatland: a Lightweight First-Person 2-D Environment for Reinforcement Learning

We propose Flatland, a simple, lightweight environment for fast prototyp...
research
03/13/2013

The Bounded Bayesian

The ideal Bayesian agent reasons from a global probability model, but re...
