Examining average and discounted reward optimality criteria in reinforcement learning

07/03/2021
by Vektor Dewanto, et al.

In reinforcement learning (RL), the goal is to obtain an optimal policy, for which the optimality criterion is fundamentally important. Two major optimality criteria are average and discounted rewards, where the latter is typically considered an approximation to the former. While the discounted reward is more popular, it is problematic to apply in environments that have no natural notion of discounting. This motivates us to revisit a) the progression of optimality criteria in dynamic programming, b) the justification for, and complications of, an artificial discount factor, and c) the benefits of directly maximizing the average reward. Our contributions include a thorough examination of the relationship between average and discounted rewards, as well as a discussion of their pros and cons in RL. We emphasize that average-reward RL methods possess the ingredient and mechanism for developing the general discounting-free optimality criterion (Veinott, 1969) in RL.
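To make the sense in which discounted reward approximates average reward concrete, the following is a minimal sketch, not taken from the paper. It assumes a hypothetical two-state Markov chain induced by some fixed policy; the transition matrix `P`, the reward vector `R`, and both function names are illustrative assumptions. It relies on the standard result that, for such a chain, the scaled discounted value (1 - gamma) * v_gamma(s) approaches the average reward (gain) as gamma approaches 1.

```python
# Hypothetical two-state chain under a fixed policy (illustrative numbers only).
P = [[0.9, 0.1],   # transition probabilities out of state 0
     [0.5, 0.5]]   # transition probabilities out of state 1
R = [1.0, 0.0]     # expected one-step reward in each state


def average_reward(P, r):
    """Gain of a two-state ergodic chain: stationary distribution dotted with r."""
    p01, p10 = P[0][1], P[1][0]
    pi0 = p10 / (p01 + p10)          # stationary probability of state 0
    return pi0 * r[0] + (1.0 - pi0) * r[1]


def discounted_values(P, r, gamma):
    """Solve (I - gamma * P) v = r for two states using Cramer's rule."""
    a, b = 1.0 - gamma * P[0][0], -gamma * P[0][1]
    c, d = -gamma * P[1][0], 1.0 - gamma * P[1][1]
    det = a * d - b * c
    v0 = (r[0] * d - b * r[1]) / det
    v1 = (a * r[1] - c * r[0]) / det
    return v0, v1


if __name__ == "__main__":
    gain = average_reward(P, R)
    print("average reward (gain):", gain)
    for gamma in (0.9, 0.99, 0.999):
        v = discounted_values(P, R, gamma)
        scaled = [(1.0 - gamma) * x for x in v]
        print("gamma =", gamma, "scaled discounted values:", scaled)
```

In this sketch the scaled discounted values still depend on the starting state for any gamma below 1, while the gain does not; the gap shrinks as gamma approaches 1, which is one way to read the abstract's remark that the discounted criterion acts as an approximation to the average one.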

research
08/18/2020

Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards

As the operations of autonomous systems generally affect simultaneously ...
research
10/18/2020

Average-reward model-free reinforcement learning: a systematic review and literature mapping

Model-free reinforcement learning (RL) has been an active area of resear...
research
06/14/2021

On-Policy Deep Reinforcement Learning for the Average-Reward Criterion

We develop theory and algorithms for average-reward on-policy Reinforcem...
research
06/30/2021

Reinforcement Learning based Disease Progression Model for Alzheimer's Disease

We model Alzheimer's disease (AD) progression by combining differential ...
research
04/26/2023

Multi-criteria Hardware Trojan Detection: A Reinforcement Learning Approach

Hardware Trojans (HTs) are undesired design or manufacturing modificatio...
research
07/19/2023

Benchmarking Potential Based Rewards for Learning Humanoid Locomotion

The main challenge in developing effective reinforcement learning (RL) p...
research
02/11/2022

Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems

In the long term, reinforcement learning (RL) is considered by many AI t...
