Transformers are Meta-Reinforcement Learners

06/14/2022
by Luckeciano C. Melo, et al.

The transformer architecture and its variants have achieved remarkable success across many machine learning tasks in recent years. This success is intrinsically tied to their capacity for handling long sequences and to the context-dependent weights produced by the attention mechanism. We argue that these capabilities fit the central demands of a meta-reinforcement learning (meta-RL) algorithm: a meta-RL agent must infer the task from a sequence of trajectories and then rapidly adapt its policy to that task, and self-attention supports both. In this work, we present TrMRL (Transformers for Meta-Reinforcement Learning), a meta-RL agent that mimics the memory reinstatement mechanism using the transformer architecture. It recursively associates recent working memories into an episodic memory through the transformer layers. We show that self-attention computes a consensus representation that minimizes the Bayes risk at each layer and provides meaningful features for computing the best actions. We conducted experiments in high-dimensional continuous control environments for locomotion and dexterous manipulation. The results show that TrMRL achieves comparable or superior asymptotic performance, sample efficiency, and out-of-distribution generalization relative to the baselines in these environments.
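
To make the described mechanism concrete, below is a minimal PyTorch sketch of a TrMRL-style policy, reconstructed only from the abstract: each (observation, action, reward) step is embedded as a working memory, a stack of transformer layers refines the sequence of working memories into an episodic memory via self-attention, and the action is computed from the most recent timestep's representation. All names, dimensions, and hyperparameters (TransformerMetaRLPolicy, d_model, and so on) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a TrMRL-style policy, based only on the abstract's
# description. Module names and hyperparameters are assumptions, not the
# authors' code.
import torch
import torch.nn as nn

class TransformerMetaRLPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, d_model=128, n_layers=4, n_heads=4):
        super().__init__()
        # Embed each (observation, action, reward) step as a "working memory".
        self.working_memory = nn.Linear(obs_dim + act_dim + 1, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        # Transformer layers recursively associate the sequence of working
        # memories into an episodic memory via self-attention.
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.action_head = nn.Linear(d_model, act_dim)

    def forward(self, obs, acts, rews):
        # obs: (B, T, obs_dim), acts: (B, T, act_dim), rews: (B, T, 1)
        steps = torch.cat([obs, acts, rews], dim=-1)
        memories = self.working_memory(steps)   # (B, T, d_model)
        episodic = self.encoder(memories)       # (B, T, d_model)
        # Act from the representation of the most recent timestep.
        return self.action_head(episodic[:, -1])

# Usage: act given a context window of 8 recent timesteps.
policy = TransformerMetaRLPolicy(obs_dim=17, act_dim=6)
action = policy(torch.randn(1, 8, 17), torch.randn(1, 8, 6), torch.randn(1, 8, 1))
```

In this sketch the agent conditions only on past context at each decision step, so no causal mask is needed; task inference and adaptation both emerge from attending over the recent trajectory.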
