Episodic Memory for Learning Subjective-Timescale Models

10/03/2020
by Alexey Zakharov, et al.

In model-based learning, an agent's model is commonly defined over transitions between consecutive states of an environment, even though planning often requires reasoning over multi-step timescales, where intermediate states are either unnecessary or, worse, accumulate prediction error. In contrast, intelligent behaviour in biological organisms is characterised by the ability to plan over varying temporal scales depending on the context. Inspired by recent work on human time perception, we devise a novel approach to learning a transition dynamics model based on sequences of episodic memories. These sequences define the agent's subjective timescale, over which it learns world dynamics and performs future planning. We implement this in the framework of active inference and demonstrate that the resulting subjective-timescale model (STM) can systematically vary the temporal extent of its predictions while preserving the same computational efficiency. Additionally, we show that STM predictions are more likely to introduce future salient events (for example, new objects coming into view), incentivising exploration of new areas of the environment. As a result, STM produces more informative action-conditioned roll-outs that assist the agent in making better decisions. We demonstrate a significant improvement in the performance of our STM agent in the Animal-AI environment over a baseline system trained using the environment's objective-timescale dynamics.
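To make the idea of a subjective timescale concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each time step of a recorded trajectory carries a scalar surprise score and that a step is stored as an episodic memory when this score exceeds a threshold; the abstract does not specify the selection criterion, so both the score and the threshold are assumptions. The names `SubjectiveTransition`, `select_memories`, and `build_subjective_transitions` are hypothetical. Transitions are then defined between consecutive memories rather than consecutive environment steps.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SubjectiveTransition:
    """One transition between two consecutive episodic memories (hypothetical type)."""
    start_state: int
    actions: Tuple[int, ...]  # all actions taken between the two memories
    end_state: int
    objective_steps: int      # number of environment steps the transition spans


def select_memories(surprise: Sequence[float], threshold: float) -> List[int]:
    """Return indices of time steps whose surprise exceeds the threshold.

    ASSUMPTION: the thresholded surprise score stands in for whatever
    criterion decides which states become episodic memories.
    """
    return [t for t, s in enumerate(surprise) if s > threshold]


def build_subjective_transitions(
    states: Sequence[int],
    actions: Sequence[int],
    surprise: Sequence[float],
    threshold: float,
) -> List[SubjectiveTransition]:
    """Build training transitions over consecutive episodic memories.

    actions[t] is the action that leads from states[t] to states[t + 1],
    so the actions between memories i and j are actions[i:j].
    """
    memory_idx = select_memories(surprise, threshold)
    transitions = []
    for i, j in zip(memory_idx, memory_idx[1:]):
        transitions.append(
            SubjectiveTransition(
                start_state=states[i],
                actions=tuple(actions[i:j]),
                end_state=states[j],
                objective_steps=j - i,
            )
        )
    return transitions
```

A model trained on such transitions predicts the next stored memory in a single step, so one prediction can span a variable number of environment steps. This matches the abstract's claim that the temporal extent of predictions varies while the per-prediction cost stays the same.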


