UGAE: A Novel Approach to Non-exponential Discounting

02/11/2023
by Ariel Kwiatkowski, et al.

The discounting mechanism in Reinforcement Learning determines the relative importance of future and present rewards. While exponential discounting is widely used in practice, non-exponential discounting methods that align with human behavior are often desirable for creating human-like agents. However, non-exponential discounting methods cannot be directly applied in modern on-policy actor-critic algorithms. To address this issue, we propose Universal Generalized Advantage Estimation (UGAE), which allows for the computation of GAE advantage values with arbitrary discounting. Additionally, we introduce Beta-weighted discounting, a continuous interpolation between exponential and hyperbolic discounting, to increase flexibility in choosing a discounting method. To showcase the utility of UGAE, we analyze the properties of various discounting methods. We also show experimentally, on standard RL benchmarks, that agents with non-exponential discounting trained via UGAE, in particular with Beta-weighted discounting, outperform variants trained with Monte Carlo advantage estimation. UGAE is simple and easily integrated into any advantage-based algorithm as a replacement for the standard recursive GAE.
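To make the idea of "GAE with arbitrary discounting" concrete, here is a minimal pure-Python sketch. It is not taken from the paper: it assumes one natural generalization in which the scalar discount γ is replaced by a discount sequence Γ(l) with Γ(0) = 1, and the advantage at step t is the λ-weighted sum of Γ(l)·r + Γ(l+1)·V(next) − Γ(l)·V(current). The function names (`general_gae`, `recursive_gae`, `beta_moment_discount`) are hypothetical. The Beta-moment discount is likewise an assumption: it is one construction that reduces to hyperbolic discounting when β = 1 and approaches exponential discounting as α and β grow with a fixed ratio.

```python
from typing import Callable, List, Sequence

Discount = Callable[[int], float]  # maps a delay l >= 0 to a weight, with Discount(0) == 1


def exponential_discount(gamma: float) -> Discount:
    return lambda l: gamma ** l


def hyperbolic_discount(k: float) -> Discount:
    return lambda l: 1.0 / (1.0 + k * l)


def beta_moment_discount(alpha: float, beta: float) -> Discount:
    # Assumption: use the l-th raw moment of a Beta(alpha, beta) distribution
    # as the discount weight. With beta == 1 this equals alpha / (alpha + l).
    def discount(l: int) -> float:
        weight = 1.0
        for i in range(l):
            weight *= (alpha + i) / (alpha + beta + i)
        return weight
    return discount


def general_gae(rewards: Sequence[float], values: Sequence[float],
                discount: Discount, lam: float) -> List[float]:
    """Assumed generalized GAE; `values` has len(rewards) + 1 entries (bootstrap last)."""
    horizon = len(rewards)
    weights = [discount(l) for l in range(horizon + 1)]
    advantages = []
    for t in range(horizon):
        adv = 0.0
        for l in range(horizon - t):
            # Discounted TD residual, with every term weighted relative to step t.
            delta = (weights[l] * rewards[t + l]
                     + weights[l + 1] * values[t + l + 1]
                     - weights[l] * values[t + l])
            adv += (lam ** l) * delta
        advantages.append(adv)
    return advantages


def recursive_gae(rewards: Sequence[float], values: Sequence[float],
                  gamma: float, lam: float) -> List[float]:
    """Standard recursive GAE with exponential discounting, for comparison."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

Under these assumptions, `general_gae(rewards, values, exponential_discount(gamma), lam)` gives the same numbers as `recursive_gae(rewards, values, gamma, lam)`, which is the sense in which such an estimator can stand in for the recursive one. The direct double loop costs O(T²) per trajectory; the recursion is not available in general because a non-exponential Γ(l+1)/Γ(l) depends on l.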

