Learning Synthetic Environments for Reinforcement Learning with Evolution Strategies

01/24/2021
by Fabio Ferreira, et al.

This work explores learning agent-agnostic synthetic environments (SEs) for Reinforcement Learning. SEs act as a proxy for target environments and allow agents to be trained more efficiently than when trained directly on the target environment. We formulate this as a bi-level optimization problem and represent an SE as a neural network. Using Natural Evolution Strategies and a population of SE parameter vectors, we train agents in the inner loop on evolving SEs, while in the outer loop we use the performance on the target task as a score for meta-updating the SE population. We show empirically that our method is capable of learning SEs for two discrete-action-space tasks (CartPole-v0 and Acrobot-v1) that allow us to train agents more robustly and with up to 60% fewer steps. Not only do we show in evaluations on a large number of agents that the SEs are robust against hyperparameter changes such as the learning rate, batch sizes and network sizes, we also show that SEs trained with DDQN agents transfer in limited ways to a discrete-action-space version of TD3 and very well to Dueling DDQN.
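
The bi-level structure described above can be illustrated with a deliberately small sketch. It is not the authors' implementation: the SE is reduced to a plain parameter vector rather than a neural network, agent training is replaced by gradient descent on a toy quadratic loss, and the outer update is a basic evolution-strategies gradient estimate with a mean baseline rather than the exact NES variant used in the paper. All function names and hyperparameters (`train_agent_on_se`, `score_on_target`, `es_meta_train`, `sigma`, `pop_size`) are hypothetical and chosen only for illustration.

```python
import random


def train_agent_on_se(psi, steps=20, lr=0.3):
    """Inner loop (toy stand-in): train an 'agent' theta on the SE given by psi.

    Assumption: the SE defines the loss ||theta - psi||^2, so training
    simply pulls the agent parameters toward psi.
    """
    theta = [0.0] * len(psi)
    for _ in range(steps):
        theta = [t - lr * 2.0 * (t - p) for t, p in zip(theta, psi)]
    return theta


def score_on_target(theta, target):
    """Outer-loop score (toy stand-in for return on the target task)."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def es_meta_train(target, n_iters=200, pop_size=16, sigma=0.1, lr=0.05, seed=0):
    """Meta-update SE parameters using scores of agents trained on perturbed SEs."""
    rng = random.Random(seed)
    dim = len(target)
    psi = [0.0] * dim
    for _ in range(n_iters):
        # Sample a population of perturbations of the current SE parameters.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
        scores = []
        for eps in noises:
            candidate = [p + sigma * e for p, e in zip(psi, eps)]
            agent = train_agent_on_se(candidate)            # inner loop
            scores.append(score_on_target(agent, target))   # outer-loop score
        baseline = sum(scores) / pop_size
        # Score-weighted sum of perturbations approximates the score gradient.
        for j in range(dim):
            grad_j = sum(
                eps[j] * (s - baseline) for eps, s in zip(noises, scores)
            ) / (pop_size * sigma)
            psi[j] += lr * grad_j
    return psi


if __name__ == "__main__":
    target = [1.0, -2.0, 0.5]
    psi = es_meta_train(target)
    print("score before:", score_on_target(train_agent_on_se([0.0] * 3), target))
    print("score after: ", score_on_target(train_agent_on_se(psi), target))
```

The point of the sketch is the division of labour: the outer loop never differentiates through agent training, it only needs a scalar score per perturbed SE, which is what makes an evolution-strategies update a natural fit for this kind of bi-level problem.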


