Efficient decorrelation of features using Gramian in Reinforcement Learning

11/19/2019
by Borislav Mavrin, et al.

Learning good representations is a long-standing problem in reinforcement learning (RL). One of the conventional ways to achieve this goal in the supervised setting is through regularization of the parameters. Extending some of these ideas to the RL setting has not yielded similar improvements in learning. In this paper, we develop an online regularization framework for decorrelating features in RL and demonstrate its utility in several test environments. We prove that the proposed algorithm converges in the linear function approximation setting and does not change the main objective of maximizing cumulative reward. We demonstrate how to scale the approach to deep RL using the Gramian of the features, achieving computational complexity that is linear in the number of features and quadratic in the batch size. We conduct an extensive empirical study of the new approach on Atari 2600 games and show a significant improvement in sample efficiency in 40 out of 49 games.
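
The complexity claim can be illustrated with a minimal sketch. The abstract does not state the exact regularizer, so the following assumes a common decorrelation penalty: the sum of squared off-diagonal entries of Φ^T Φ, where Φ is an n x d batch of (already centered) features. The function names and the penalty itself are illustrative assumptions, not the authors' implementation. Because the Frobenius norms of Φ^T Φ (d x d) and of the Gramian Φ Φ^T (n x n) are equal, the same penalty can be computed from the Gramian at a cost that is linear in d and quadratic in n.

```python
def offdiag_penalty_direct(phi):
    """Sum of squared off-diagonal entries of phi^T phi (d x d).

    Cost is O(n * d^2): quadratic in the number of features d.
    """
    n, d = len(phi), len(phi[0])
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                c = sum(phi[k][i] * phi[k][j] for k in range(n))
                total += c * c
    return total


def offdiag_penalty_gramian(phi):
    """Same quantity computed from the n x n Gramian phi phi^T.

    Cost is O(n^2 * d): linear in d, quadratic in the batch size n.
    Uses ||phi^T phi||_F^2 == ||phi phi^T||_F^2 and subtracts the
    squared diagonal of phi^T phi (the squared column norms).
    """
    n, d = len(phi), len(phi[0])
    frob = 0.0
    for a in range(n):
        for b in range(n):
            g = sum(phi[a][i] * phi[b][i] for i in range(d))
            frob += g * g
    diag = sum(sum(phi[k][i] ** 2 for k in range(n)) ** 2 for i in range(d))
    return frob - diag


# Hypothetical batch: n = 2 samples, d = 3 features.
phi = [[1.0, 2.0, 0.5],
       [-1.0, 0.0, 1.5]]
print(offdiag_penalty_direct(phi), offdiag_penalty_gramian(phi))
```

In deep RL the feature dimension is typically much larger than the mini-batch, which is when the Gramian form is the cheaper of the two.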


research
07/12/2021

A Simple Reward-free Approach to Constrained Reinforcement Learning

In constrained reinforcement learning (RL), a learning agent seeks to no...
research
10/03/2022

Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation

We study the problem of deployment efficient reinforcement learning (RL)...
research
11/30/2022

Targets in Reinforcement Learning to solve Stackelberg Security Games

Reinforcement Learning (RL) algorithms have been successfully applied to...
research
06/27/2012

Greedy Algorithms for Sparse Reinforcement Learning

Feature selection and regularization are becoming increasingly prominent...
research
06/02/2023

Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

We consider the reinforcement learning (RL) problem with general utiliti...
research
03/30/2020

Agent57: Outperforming the Atari Human Benchmark

Atari games have been a long-standing benchmark in the reinforcement lea...
research
08/11/2020

Batch Value-function Approximation with Only Realizability

We solve a long-standing problem in batch reinforcement learning (RL): l...
