Evolving Reinforcement Learning Algorithms

01/08/2021
by John D. Co-Reyes, et al.

We propose a method for meta-learning reinforcement learning algorithms by searching over the space of computational graphs that compute the loss function a value-based, model-free RL agent optimizes. The learned algorithms are domain-agnostic and can generalize to environments not seen during training. Our method can either learn from scratch or bootstrap from known algorithms such as DQN, enabling interpretable modifications that improve performance. Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm. Bootstrapping from DQN, we highlight two learned algorithms that generalize well to other classical control tasks, gridworld-type tasks, and Atari games. Analysis of the learned algorithms' behavior shows a resemblance to recently proposed RL algorithms that address overestimation in value-based methods.
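
To make the search space more concrete, the sketch below writes the standard one-step TD loss used by DQN as a small function of the quantities an agent observes. It is only an illustration of the kind of loss the abstract says is rediscovered and used as a starting point, not the paper's implementation or either of the learned algorithms. The function name `td_loss`, its argument names, and the default `gamma` are assumptions made for this example.

```python
def td_loss(q_sa, reward, q_next, done, gamma=0.99):
    """Mean squared one-step TD error over a batch (standard DQN-style loss).

    q_sa:   Q(s, a) for the action taken in each transition
    reward: reward received in each transition
    q_next: per-transition list of target-network values Q_target(s', a')
    done:   1.0 if the transition ended the episode, else 0.0
    All names here are hypothetical and chosen for illustration.
    """
    total = 0.0
    for q, r, next_values, d in zip(q_sa, reward, q_next, done):
        # Bootstrapped target: r + gamma * max_a' Q_target(s', a'),
        # with the bootstrap term dropped at terminal states.
        target = r + gamma * (1.0 - d) * max(next_values)
        total += (q - target) ** 2
    return total / len(q_sa)


# Two transitions: the first is terminal, the second is not.
loss = td_loss(
    q_sa=[1.0, 0.5],
    reward=[1.0, 0.0],
    q_next=[[0.0, 2.0], [1.0, 0.5]],
    done=[1.0, 0.0],
    gamma=0.9,
)
```

Each operation in this function (the max, the discounting, the subtraction, the squaring) can be read as a node in a computational graph, which is the kind of object the abstract describes searching over.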

research
06/12/2018

Unsupervised Meta-Learning for Reinforcement Learning

Meta-learning is a powerful tool that builds on multi-task learning to l...
research
07/17/2020

Discovering Reinforcement Learning Algorithms

Reinforcement learning (RL) algorithms update an agent's parameters acco...
research
09/04/2019

Learning sparse representations in reinforcement learning

Reinforcement learning (RL) algorithms allow artificial agents to improv...
research
06/12/2020

A Brief Look at Generalization in Visual Meta-Reinforcement Learning

Due to the realization that deep reinforcement learning algorithms train...
research
04/28/2020

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

We propose a simple data augmentation technique that can be applied to s...
research
08/07/2022

A Game-Theoretic Perspective of Generalization in Reinforcement Learning

Generalization in reinforcement learning (RL) is of importance for real ...
research
07/03/2020

A Unifying View of Optimism in Episodic Reinforcement Learning

The principle of optimism in the face of uncertainty underpins many theo...
