An Empirical Study on Hyperparameters and their Interdependence for RL Generalization

06/02/2019
by   Xingyou Song, et al.
0

Recent results in Reinforcement Learning (RL) have shown that agents with limited training environments are susceptible to a large amount of overfitting across many domains. A key challenge for RL generalization is to quantitatively explain the effects of changing parameters on testing performance. Such parameters include architecture, regularization, and RL-dependent variables such as discount factor and action stochasticity. We provide empirical results that show complex and interdependent relationships between hyperparameters and generalization. We further show that several empirical metrics such as gradient cosine similarity and trajectory-dependent metrics serve to provide intuition towards these results.

READ FULL TEXT

page 3

page 4

page 7

page 9

page 10

page 11

page 12

page 13

research
12/21/2022

Hyperparameters in Contextual RL are Highly Situational

Although Reinforcement Learning (RL) has shown impressive results in gam...
research
04/18/2018

A Study on Overfitting in Deep Reinforcement Learning

Recent years have witnessed significant progresses in deep Reinforcement...
research
06/10/2020

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

In recent years, on-policy reinforcement learning (RL) has been successf...
research
08/03/2020

Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning

Despite the significant progress of deep reinforcement learning (RL) in ...
research
05/29/2019

On the Generalization Gap in Reparameterizable Reinforcement Learning

Understanding generalization in reinforcement learning (RL) is a signifi...
research
06/01/2021

Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics

To understand better the causes of good generalization performance in st...
research
10/28/2019

Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck

The ability for policies to generalize to new environments is key to the...

Please sign up or login with your details

Forgot password? Click here to reset