Learning to Win, Lose and Cooperate through Reward Signal Evolution

05/17/2021
by Rafał Muszyński, et al.

Solving a reinforcement learning problem typically requires correctly specifying, in advance, the reward signal from which the algorithm learns. Here, we approach reward signal design with an evolutionary search over the space of all possible reward signals. We introduce a general framework for optimizing N goals given n reward signals. Through experiments in the game of Pong, we demonstrate that this approach allows agents to learn high-level goals, such as winning, losing and cooperating, from scratch and without prespecified reward signals. Some of the solutions found by the algorithm are surprising, in the sense that a person trying to hand-code a given behaviour through a specific reward signal would probably not have chosen them. Furthermore, the proposed approach appears to train more stably than typical score-based reward signals.
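To make the idea of searching over reward signals concrete, here is a minimal sketch of an elitist evolutionary search over a vector of n reward weights for a single goal. It is not the authors' implementation: the function names, the hyperparameters, and especially `toy_goal_fitness` are assumptions for illustration. In a real setting the fitness call would train an agent with the candidate reward signal and then measure how well its behaviour meets the high-level goal (for example, win rate in Pong); here a cheap stand-in with an arbitrary hidden optimum is used so the example runs on its own.

```python
import random

def toy_goal_fitness(reward_weights):
    # Hypothetical stand-in for "train an agent with this reward signal, then
    # score how well it achieves the goal". The hidden optimum is arbitrary.
    hidden_optimum = [1.0, -1.0, 0.5]
    return -sum((w - h) ** 2 for w, h in zip(reward_weights, hidden_optimum))

def evolve_reward_signal(fitness, n_signals=3, population_size=20,
                         n_parents=5, generations=40, sigma=0.3, seed=0):
    """Search for reward weights that maximize `fitness` (all defaults are assumptions)."""
    rng = random.Random(seed)
    # Start from random candidate reward signals, one weight per signal.
    population = [[rng.uniform(-2, 2) for _ in range(n_signals)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the candidates whose resulting behaviour scores best.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        # Mutation: each child is a Gaussian perturbation of a random parent.
        children = [[w + rng.gauss(0, sigma) for w in rng.choice(parents)]
                    for _ in range(population_size - n_parents)]
        # Elitism: parents survive, so the best fitness never decreases.
        population = parents + children
    return max(population, key=fitness)

best = evolve_reward_signal(toy_goal_fitness)
print("best reward weights:", best)
print("goal fitness:", toy_goal_fitness(best))
```

Extending the sketch to N goals would amount to running one such search per goal, each with its own fitness function, over the same n reward signals; how the paper actually handles multiple goals is described in the full text.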

research
03/05/2020

Reward Design in Cooperative Multi-agent Reinforcement Learning for Packet Routing

In cooperative multi-agent reinforcement learning (MARL), how to design ...
research
12/20/2022

Settling the Reward Hypothesis

The reward hypothesis posits that, "all of what we mean by goals and pur...
research
11/01/2019

Positive-Unlabeled Reward Learning

Learning reward functions from data is a promising path towards achievin...
research
11/26/2021

Learning Long-Term Reward Redistribution via Randomized Return Decomposition

Many practical applications of reinforcement learning require agents to ...
research
02/24/2022

Learning Transferable Reward for Query Object Localization with Policy Adaptation

We propose a reinforcement learning based approach to query object local...
research
04/27/2020

Evolutionary Stochastic Policy Distillation

Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging ...
research
06/14/2020

Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning

Morpion Solitaire is a popular single player game, performed with paper ...
