A Unifying View of Optimism in Episodic Reinforcement Learning

07/03/2020
by   Gergely Neu, et al.
0

The principle of optimism in the face of uncertainty underpins many theoretically successful reinforcement learning algorithms. In this paper we provide a general framework for designing, analyzing and implementing such algorithms in the episodic reinforcement learning problem. This framework is built upon Lagrangian duality, and demonstrates that every model-optimistic algorithm that constructs an optimistic MDP has an equivalent representation as a value-optimistic dynamic programming algorithm. Typically, it was thought that these two classes of algorithms were distinct, with model-optimistic algorithms benefiting from a cleaner probabilistic analysis while value-optimistic algorithms are easier to implement and thus more practical. With the framework developed in this paper, we show that it is possible to get the best of both worlds by providing a class of algorithms which have a computationally efficient dynamic-programming implementation and also a simple probabilistic analysis. Besides being able to capture many existing algorithms in the tabular setting, our framework can also address largescale problems under realizable function approximation, where it enables a simple model-based analysis of some recently proposed methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/30/2017

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming

Approximate dynamic programming algorithms, such as approximate value it...
research
08/03/2022

Contractivity of Bellman Operator in Risk Averse Dynamic Programming with Infinite Horizon

The paper deals with a risk averse dynamic programming problem with infi...
research
07/20/2020

Lagrangian Duality in Reinforcement Learning

Although duality is used extensively in certain fields, such as supervis...
research
02/08/2020

Inferential Induction: Joint Bayesian Estimation of MDPs and Value Functions

Bayesian reinforcement learning (BRL) offers a decision-theoretic soluti...
research
01/08/2021

Evolving Reinforcement Learning Algorithms

We propose a method for meta-learning reinforcement learning algorithms ...
research
10/26/2020

Deep reinforced learning enables solving discrete-choice life cycle models to analyze social security reforms

Discrete-choice life cycle models can be used to, e.g., estimate how soc...

Please sign up or login with your details

Forgot password? Click here to reset