What is Going on Inside Recurrent Meta Reinforcement Learning Agents?

04/29/2021
by Safa Alver, et al.

Recurrent meta reinforcement learning (meta-RL) agents employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm". After the agent is trained on a pre-specified task distribution, the learned weights of its RNN are said to implement an efficient learning algorithm through their activity dynamics, which allows the agent to quickly solve new tasks sampled from the same distribution. However, due to the black-box nature of these agents, the way in which they work is not yet fully understood. In this study, we shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework. We hypothesize that the learned activity dynamics act as belief states for such agents. Several illustrative experiments support this hypothesis and suggest that recurrent meta-RL agents can be viewed as agents that learn to act optimally in partially observable environments consisting of multiple related tasks. This view helps in understanding their failure cases and some interesting model-based results reported in the literature.
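
To make the notion of a belief state concrete, the following is a minimal sketch of an exact Bayesian belief update over task identity. It is not the authors' code or experimental setup: the two-task bandit distribution `TASKS`, the function `update_belief`, and the example history are all hypothetical and chosen only for illustration. Under the POMDP view described above, the hidden task is the unobserved part of the state, and a quantity like this posterior is what the RNN's activity is hypothesized to track.

```python
from typing import List, Sequence

# Hypothetical task distribution (an assumption for illustration):
# each task is a two-armed Bernoulli bandit, given by per-arm reward probabilities.
TASKS: List[Sequence[float]] = [(0.9, 0.1), (0.1, 0.9)]


def update_belief(belief: List[float], action: int, reward: int) -> List[float]:
    """One Bayes-filter step: posterior over task identity after (action, reward)."""
    # Likelihood of the observed reward under each candidate task.
    likelihoods = [
        probs[action] if reward == 1 else 1.0 - probs[action]
        for probs in TASKS
    ]
    unnormalized = [b * lik for b, lik in zip(belief, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


belief = [0.5, 0.5]  # uniform prior over the two tasks
# Hypothetical interaction history: (arm pulled, reward received).
for action, reward in [(0, 1), (0, 1), (1, 0)]:
    belief = update_belief(belief, action, reward)
    print(belief)
```

After three observations that all favor the first task, nearly all of the probability mass sits on it. An agent acting on this belief would exploit arm 0, which is the kind of fast within-task adaptation the abstract attributes to the RNN's activity dynamics.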

research
05/15/2019

Meta reinforcement learning as task inference

Humans achieve efficient learning by relying on prior knowledge about th...
research
02/16/2018

Learning Implicit Communication Strategies for the Purpose of Illicit Collusion

Winner-take-all dynamics are prevalent throughout the human and natural ...
research
12/23/2019

Variational Recurrent Models for Solving Partially Observable Control Tasks

In partially observable (PO) environments, deep reinforcement learning (...
research
01/23/2021

BF++: a language for general-purpose program synthesis

Most state of the art decision systems based on Reinforcement Learning (...
research
11/09/2016

RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning

Deep reinforcement learning (deep RL) has been successful in learning so...
research
01/15/2023

Neuro-symbolic Meta Reinforcement Learning for Trading

We model short-duration (e.g. day) trading in financial markets as a seq...
