Delayed Geometric Discounts: An Alternative Criterion for Reinforcement Learning

09/26/2022
by   Firas Jarboui, et al.

The endeavor of artificial intelligence (AI) is to design autonomous agents capable of achieving complex tasks. In particular, reinforcement learning (RL) provides a theoretical framework for learning optimal behaviors. In practice, RL algorithms rely on geometric discounts to evaluate this optimality. Unfortunately, this does not cover decision processes where future returns are not exponentially less valuable. Depending on the problem, this limitation induces sample inefficiency (as feedback is exponentially decayed) and requires additional curricula or exploration mechanisms (to deal with sparse, deceptive, or adversarial rewards). In this paper, we tackle these issues by generalizing the discounted problem formulation with a family of delayed objective functions. We investigate the underlying RL problem to derive: 1) the optimal stationary solution and 2) an approximation of the optimal non-stationary control. The devised algorithms solved hard exploration problems in tabular environments and improved sample efficiency on classic simulated robotics benchmarks.
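To make the "exponentially decayed" feedback concrete, here is a minimal sketch of the standard geometric-discount return, in which a reward received at step t is weighted by gamma**t. It illustrates only the conventional criterion the abstract refers to, not the delayed objective proposed in the paper. The function name, the horizon, and the value of gamma are assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma):
    """Geometric-discount return: sum over t of gamma**t * rewards[t]."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


# Hypothetical sparse-reward episode: a single reward of 1.0 at the last step.
horizon = 100
sparse_rewards = [0.0] * (horizon - 1) + [1.0]

# Seen from the first step, the final reward is scaled by gamma**(horizon - 1).
value_at_start = discounted_return(sparse_rewards, gamma=0.9)
print(value_at_start)
```

With these assumed values the final reward is scaled by 0.9**99, which is on the order of 1e-5, so the learning signal that reaches early steps is very weak. This is the kind of decay the abstract points to as a source of sample inefficiency.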

research
05/03/2017

Answer Set Programming for Non-Stationary Markov Decision Processes

Non-stationary domains, where unforeseen changes happen, present a chall...
research
05/10/2019

Reinforcement Learning in Non-Stationary Environments

Reinforcement learning (RL) methods learn optimal decisions in the prese...
research
11/19/2022

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

We study risk-sensitive reinforcement learning (RL) based on an entropic...
research
05/21/2018

Hierarchical Reinforcement Learning with Hindsight

Reinforcement Learning (RL) algorithms can suffer from poor sample effic...
research
01/06/2021

Geometric Entropic Exploration

Exploration is essential for solving complex Reinforcement Learning (RL)...
research
11/15/2022

General Intelligence Requires Rethinking Exploration

We are at the cusp of a transition from "learning from data" to "learnin...
research
03/26/2020

A Flexible Job Shop Scheduling Representation of the Autonomous In-Space Assembly Task Assignment Problem

As in-space exploration increases, autonomous systems will play a vital ...
