Discounted Reinforcement Learning is Not an Optimization Problem

10/04/2019
by Abhishek Naik, et al.

Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks. This is because it is not an optimization problem — it lacks an objective function. After substantiating these claims, we go on to address some misconceptions about discounting and its connection to the average reward formulation. We encourage researchers to adopt rigorous optimization approaches for reinforcement learning in continuing tasks, such as average reward.
