Acting upon Imagination: when to trust imagined trajectories in model based reinforcement learning

05/12/2021
by Adrian Remonda, et al.

Model-based reinforcement learning (MBRL) uses an imperfect model of the world to imagine trajectories of future states and plan the best actions to maximize a reward function. Because these trajectories are imperfect, MBRL attempts to compensate by relying on model predictive control (MPC) to continuously re-imagine trajectories from scratch. Such regeneration of imagined trajectories carries the major computational cost, and the cost grows in tasks with a longer receding horizon. This paper investigates how far into the future imagined trajectories can be relied upon while still maintaining acceptable reward. First, we present an error analysis of systematically skipping recalculations for a varying number of consecutive steps on challenging benchmark control tasks. Second, we propose two methods for deciding when to trust and act upon imagined trajectories: one looks at recent errors with respect to expectations, the other compares the confidence in an imagined action against its execution. Third, we evaluate the effects of acting upon imagination while training the model of the world. Results show that acting upon imagination can reduce calculations by at least 20%, depending on the environment, while retaining acceptable reward.
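The core idea of the abstract — execute several consecutive actions from an already-imagined trajectory instead of replanning at every step — can be sketched as a small MPC loop. This is an illustrative toy, not the authors' implementation: the random-shooting planner, the `skip` parameter, and all function names here are assumptions.

```python
import numpy as np

def random_shooting_plan(model, state, horizon=10, n_candidates=100,
                         action_dim=1, rng=None):
    """Toy random-shooting MPC planner (illustrative, not the paper's):
    sample candidate action sequences, roll each out through the learned
    model, and return the sequence with the highest predicted return."""
    rng = rng or np.random.default_rng()
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    returns = np.zeros(n_candidates)
    for i, actions in enumerate(candidates):
        s = state
        for a in actions:
            s, r = model(s, a)  # learned dynamics: predicts next state, reward
            returns[i] += r
    return candidates[np.argmax(returns)]

def act_upon_imagination(env_step, model, state, n_steps=50, skip=4, horizon=10):
    """After each replanning step, execute the next `skip` actions of the
    imagined trajectory without recalculating (skip=0 is standard MPC)."""
    total_reward, plan, t = 0.0, None, 0
    for _ in range(n_steps):
        if plan is None or t >= min(skip + 1, len(plan)):
            plan, t = random_shooting_plan(model, state, horizon=horizon), 0
        state, reward = env_step(state, plan[t])  # act on the imagined action
        total_reward += reward
        t += 1
    return total_reward
```

With `skip=4`, the planner is invoked roughly once every five environment steps instead of every step, which is the kind of calculation saving the abstract quantifies; the paper's two proposed methods replace the fixed `skip` with an adaptive decision of when to trust the imagined trajectory.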


