Efficient Deep Reinforcement Learning Requires Regulating Overfitting

04/20/2023
by Qiyang Li, et al.

Deep reinforcement learning algorithms that learn policies by trial and error must learn from limited amounts of data collected by actively interacting with the environment. While many prior works have shown that proper regularization techniques are crucial for enabling data-efficient RL, a general understanding of the bottlenecks in data-efficient RL has remained elusive. Consequently, it has been difficult to devise a universal technique that works well across all domains. In this paper, we attempt to understand the primary bottleneck in sample-efficient deep RL by examining several potential hypotheses such as non-stationarity, excessive action distribution shift, and overfitting. We perform a thorough, controlled, and systematic empirical analysis on state-based DeepMind control suite (DMC) tasks to show that high temporal-difference (TD) error on a validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms, and that prior methods that lead to good performance do, in fact, keep the validation TD error low. This observation gives us a robust principle for making deep RL efficient: we can hill-climb on the validation TD error by utilizing any form of regularization technique from supervised learning. We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
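
As a rough illustration of the quantity the abstract refers to, the sketch below computes a mean squared TD error on a held-out batch of transitions and picks the candidate with the lowest value. It is a minimal sketch under stated assumptions, not the paper's implementation: the names `validation_td_error`, `select_by_validation_td`, `q_fn`, `target_q_fn`, and `policy`, the squared-error form, and the discount `gamma=0.99` are all hypothetical choices made for illustration.

```python
import numpy as np


def validation_td_error(q_fn, target_q_fn, policy, batch, gamma=0.99):
    """Mean squared TD error on a held-out batch of transitions.

    Assumptions (not from the source): q_fn and target_q_fn map
    (states, actions) to a 1-D array of Q-values, policy maps states
    to actions, and batch is (s, a, r, s_next, done) as NumPy arrays.
    """
    s, a, r, s_next, done = batch
    # Bootstrapped target; terminal transitions contribute only the reward.
    target = r + gamma * (1.0 - done) * target_q_fn(s_next, policy(s_next))
    return float(np.mean((q_fn(s, a) - target) ** 2))


def select_by_validation_td(errors):
    """Return the name of the candidate with the lowest validation TD error.

    errors is a dict such as {"dropout": 0.5, "weight_decay": 0.2};
    the candidate names are illustrative only.
    """
    return min(errors, key=errors.get)
```

In such a setup, each candidate regularizer would train its own agent on the shared training transitions, and the held-out batch would be used only for scoring, never for gradient updates.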

research
09/29/2018

Generalization and Regularization in DQN

Deep reinforcement learning (RL) algorithms have shown an impressive abi...
research
04/18/2018

A Study on Overfitting in Deep Reinforcement Learning

Recent years have witnessed significant progresses in deep Reinforcement...
research
11/17/2021

Self-Learning Tuning for Post-Silicon Validation

Increasing complexity of modern chips makes design validation more diffi...
research
03/21/2020

Deep Reinforcement Learning with Smooth Policy

Deep neural networks have been widely adopted in modern reinforcement le...
research
02/26/2019

Diagnosing Bottlenecks in Deep Q-learning Algorithms

Q-learning methods represent a commonly used class of algorithms in rein...
research
02/19/2019

Investigating Generalisation in Continuous Deep Reinforcement Learning

Deep Reinforcement Learning has shown great success in a variety of cont...
research
06/19/2023

Enhancing Generalization and Plasticity for Sample Efficient Reinforcement Learning

In Reinforcement Learning (RL), enhancing sample efficiency is crucial, ...
