Learning Synthetic Environments and Reward Networks for Reinforcement Learning

02/06/2022
by Fabio Ferreira, et al.

We introduce Synthetic Environments (SEs) and Reward Networks (RNs), both represented by neural networks, as proxy environment models for training Reinforcement Learning (RL) agents. We show that an agent trained exclusively on the SE is able to solve the corresponding real environment. While an SE acts as a full proxy for a real environment by learning its state dynamics and rewards, an RN is a partial proxy that learns to augment or replace rewards. We use bi-level optimization to evolve SEs and RNs: the inner loop trains the RL agent, and the outer loop trains the parameters of the SE / RN via an evolution strategy. We evaluate this approach on a broad range of RL algorithms and classic control environments. In a one-to-one comparison, learning an SE proxy requires more interactions with the real environment than training agents only on the real environment. However, once such an SE has been learned, no further interactions with the real environment are needed to train new agents. Moreover, the learned SE proxies allow us to train agents with fewer interactions while maintaining the original task performance. Our empirical results suggest that SEs achieve this by learning informed representations that bias the agents towards relevant states. We also find that these proxies are robust to hyperparameter variation and can transfer to unseen agents.
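
The bi-level structure described in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: the scalar "agent", the quadratic synthetic reward, the toy real task with optimum `TARGET`, and the names `real_return`, `train_agent`, and `evolve_proxy` are all assumptions made for illustration, and the outer loop is a generic Gaussian-perturbation evolution strategy rather than the specific variant used in the paper. The inner loop trains the agent only on the proxy; the outer loop scores each trained agent on the real task and moves the proxy parameter accordingly.

```python
import random

TARGET = 3.0  # hypothetical optimum of the toy "real" task (assumption)


def real_return(action):
    """Return of an action on the toy real environment (higher is better)."""
    return -(action - TARGET) ** 2


def train_agent(psi, steps=50, lr=0.1):
    """Inner loop: train a scalar agent purely on the synthetic reward
    -(action - psi)^2, where psi is the proxy's only parameter."""
    action = 0.0
    for _ in range(steps):
        grad = -2.0 * (action - psi)  # gradient of the synthetic reward
        action += lr * grad           # gradient ascent step
    return action


def evolve_proxy(generations=200, population=10, sigma=0.1, lr=0.05, seed=0):
    """Outer loop: a simple evolution strategy over the proxy parameter psi.
    Each perturbed proxy is scored by the real return of an agent that was
    trained only on that proxy."""
    rng = random.Random(seed)
    psi = 0.0
    for _ in range(generations):
        noises = [rng.gauss(0.0, 1.0) for _ in range(population)]
        scores = [real_return(train_agent(psi + sigma * n)) for n in noises]
        baseline = sum(scores) / population
        # Score-weighted average of the perturbations (ES gradient estimate).
        step = sum((s - baseline) * n for s, n in zip(scores, noises))
        psi += lr * step / (population * sigma)
    return psi


psi = evolve_proxy()
agent_action = train_agent(psi)  # a new agent trained without the real task
print(psi, agent_action, real_return(agent_action))
```

Note that the final `train_agent(psi)` call never touches `real_return`: once the proxy has been evolved, new agents are trained on it alone, which mirrors the claim in the abstract at toy scale.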
