On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

11/30/2015
by Juergen Schmidhuber, et al.

This paper addresses the general problem of reinforcement learning (RL) in partially observable environments. In 2013, our large RL recurrent neural networks (RNNs) learned from scratch to drive simulated cars from high-dimensional video input. However, real brains are more powerful in many ways. In particular, they learn a predictive model of their initially unknown environment, and somehow use it for abstract (e.g., hierarchical) planning and reasoning. Guided by algorithmic information theory, we describe RNN-based AIs (RNNAIs) designed to do the same. Such an RNNAI can be trained on never-ending sequences of tasks, some of them provided by the user, others invented by the RNNAI itself in a curious, playful fashion, to improve its RNN-based world model. Unlike our previous model-building RNN-based RL machines dating back to 1990, the RNNAI learns to actively query its model for abstract reasoning, planning, and decision making, essentially "learning to think." The basic ideas of this report can be applied to many other cases where one RNN-like system exploits the algorithmic information content of another. These ideas are taken from a grant proposal submitted in Fall 2014, and they also explain concepts such as "mirror neurons." Experimental results will be described in separate papers.

