Temporal Regularization in Markov Decision Process

11/01/2018
by Pierre Thodoroff, et al.

Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high-dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.
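
To make "smoothness in value estimates over trajectories" concrete, here is a minimal tabular sketch. Only the abstract is available here, so the update rule is an assumption and not necessarily the authors' algorithm: the TD(0) bootstrap target is mixed, with a hypothetical weight `beta`, with the current estimate of the state visited one step earlier. The function names, the random-walk environment, and all parameter values are illustrative.

```python
import random


def random_walk_episode(n_states, rng):
    """Hypothetical toy chain: start in the middle, step left or right,
    stop at either end; reaching the right end pays reward 1."""
    state = n_states // 2
    episode = []
    while True:
        next_state = state + rng.choice((-1, 1))
        done = next_state in (0, n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        episode.append((state, reward, next_state, done))
        if done:
            return episode
        state = next_state


def td0_temporal_reg(episodes, n_states, gamma=0.9, beta=0.2, alpha=0.1):
    """Tabular TD(0) with an assumed temporally smoothed target.

    episodes: list of trajectories of (state, reward, next_state, done).
    beta = 0 recovers ordinary TD(0); larger beta pulls the target
    toward the value of the previously visited state, trading
    variance for bias.
    """
    v = [0.0] * n_states
    for episode in episodes:
        prev_state = None
        for state, reward, next_state, done in episode:
            next_v = 0.0 if done else v[next_state]
            if prev_state is None:
                # First step of a trajectory: no past state to smooth with.
                bootstrap = next_v
            else:
                bootstrap = (1.0 - beta) * next_v + beta * v[prev_state]
            target = reward + gamma * bootstrap
            v[state] += alpha * (target - v[state])
            prev_state = state
    return v


if __name__ == "__main__":
    rng = random.Random(0)
    episodes = [random_walk_episode(7, rng) for _ in range(500)]
    print(td0_temporal_reg(episodes, 7, beta=0.0))  # plain TD(0)
    print(td0_temporal_reg(episodes, 7, beta=0.2))  # smoothed target
```

Comparing the two printed value tables on the same episodes shows how the smoothing weight shifts the estimates; the abstract's point is that this shift is a bias accepted in exchange for lower variance.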

