Operator Augmentation for Model-based Policy Evaluation

10/25/2021
by Xun Tang, et al.

In model-based reinforcement learning, the transition matrix and reward vector are often estimated from random samples subject to noise. Even if the estimated model is an unbiased estimate of the true underlying model, the value function computed from the estimated model is biased. We introduce an operator augmentation method for reducing the error introduced by the estimated model. When the error is measured in the residual norm, we prove that the augmentation factor is always positive and upper bounded by 1 + O(1/n), where n is the number of samples used in learning each row of the transition matrix. We also propose a practical numerical algorithm for implementing the operator augmentation.
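The following minimal Python sketch (ours, not the paper's algorithm) illustrates the phenomenon the abstract describes: each row of the transition matrix is estimated from n samples, so the model estimate itself is unbiased, yet the plug-in value function is biased. A scalar shrinkage factor of size 1 + O(1/n) is then applied as an illustrative stand-in for the paper's operator augmentation; the form v_aug = (I - gamma * P_hat)^{-1} r / beta and the choice beta = 1 + 1/n are assumptions for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)
    S, gamma, n = 5, 0.9, 50                 # states, discount factor, samples per row

    P = rng.dirichlet(np.ones(S), size=S)    # true row-stochastic transition matrix
    r = rng.uniform(size=S)                  # true reward vector (treated as known here)
    v_true = np.linalg.solve(np.eye(S) - gamma * P, r)

    def sample_model():
        # Empirical transition matrix: row s is the histogram of n i.i.d.
        # transitions drawn from P[s], so E[P_hat] = P (an unbiased model).
        return np.vstack([rng.multinomial(n, P[s]) / n for s in range(S)])

    def plugin_value(P_hat, beta=1.0):
        # Plug-in policy evaluation, optionally shrunk by a scalar factor beta.
        # NOTE: this scalar shrinkage is an illustrative stand-in for the
        # paper's operator augmentation, not its actual construction.
        return np.linalg.solve(np.eye(S) - gamma * P_hat, r) / beta

    trials = 20000
    plain = np.mean([plugin_value(sample_model()) for _ in range(trials)], axis=0)
    beta = 1.0 + 1.0 / n                     # hypothetical factor of size 1 + O(1/n)
    aug = np.mean([plugin_value(sample_model(), beta) for _ in range(trials)], axis=0)

    print("bias of plug-in estimate :", plain - v_true)
    print("bias with shrinkage beta :", aug - v_true)

Because the matrix inverse is a nonlinear function of P_hat, E[(I - gamma * P_hat)^{-1} r] differs from v_true even though E[P_hat] = P; that gap is the bias the operator augmentation method targets.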


Related research

04/22/2021  Operator Augmentation for General Noisy Matrix Systems
In the computational sciences, one must often estimate model parameters ...

04/19/2018  Lipschitz Continuity in Model-based Reinforcement Learning
Model-based reinforcement-learning methods learn transition and reward m...

10/19/2020  Operator Augmentation for Noisy Elliptic Systems
In the computational sciences, one must often estimate model parameters ...

11/25/2022  Operator Splitting Value Iteration
We introduce new planning and reinforcement learning algorithms for disc...

03/09/2020  Transfer Reinforcement Learning under Unobserved Contextual Information
In this paper, we study a transfer reinforcement learning problem where ...

02/07/2021  Model-Augmented Q-learning
In recent years, Q-learning has become indispensable for model-free rein...

09/28/2019  Practical shift choice in the shift-and-invert Krylov subspace evaluations of the matrix exponential
We propose two methods to find a proper shift parameter in the shift-and...
