Internally Rewarded Reinforcement Learning

02/01/2023
by Mengdi Li, et al.

We study a class of reinforcement learning problems in which the reward signals for policy learning are generated by a discriminator that depends on, and is jointly optimized with, the policy. This interdependence makes learning unstable: reward signals from an immature discriminator are noisy and impede policy learning, and conversely, an untrained policy impedes discriminator learning. We call this setting Internally Rewarded Reinforcement Learning (IRRL) because the reward is provided not directly by the environment but internally by the discriminator. In this paper, we formalize IRRL and present a class of problems that belong to it. We theoretically derive and empirically analyze the effect of the reward function in IRRL and, based on these analyses, propose the clipped linear reward function. Experimental results show that the proposed reward function consistently stabilizes training by reducing the impact of reward noise, leading to faster convergence and higher performance than the baselines across diverse tasks.
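The abstract names the clipped linear reward but does not define it. As a rough illustration only, the sketch below assumes one plausible form: the discriminator's probability for the correct target minus a prior probability, clipped below at zero. It is compared with a logarithmic reward, which is also an assumption here and not taken from the text above. The function names, the arguments `q_true` and `prior`, and the sample values are all hypothetical.

```python
import math


def log_reward(q_true: float, prior: float) -> float:
    """Assumed logarithmic reward: log q(y | trajectory) - log p(y)."""
    return math.log(q_true) - math.log(prior)


def clipped_linear_reward(q_true: float, prior: float) -> float:
    """Assumed clipped linear reward: max(q(y | trajectory) - p(y), 0)."""
    return max(q_true - prior, 0.0)


if __name__ == "__main__":
    prior = 0.1  # hypothetical uniform prior over 10 targets
    # Hypothetical discriminator outputs for the correct target,
    # ranging from badly wrong (0.01) to fairly confident (0.5).
    for q_true in (0.01, 0.1, 0.5):
        print(
            f"q={q_true:.2f}  "
            f"log={log_reward(q_true, prior):+.3f}  "
            f"clipped_linear={clipped_linear_reward(q_true, prior):+.3f}"
        )
```

Under these assumptions, the logarithmic reward turns strongly negative when an immature discriminator assigns a low probability to the correct target, whereas the clipped linear form stays bounded in [0, 1 - prior] and returns zero in that case. This is one way to picture how a reward function might reduce the impact of reward noise early in training.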


