Deep Residual Reinforcement Learning

05/03/2019
by Shangtong Zhang et al.

We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperforms vanilla DDPG on the DeepMind Control Suite benchmark. Moreover, we find that residual algorithms are an effective approach to the distribution mismatch problem in model-based planning. Compared with the existing TD(k) method, our residual-based method makes weaker assumptions about the model and yields a greater performance boost.
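For context, residual algorithms (Baird, 1995) differ from the more common semi-gradient TD update in that the gradient of the squared Bellman error also flows through the bootstrap target, not only through the current-state prediction. A minimal sketch for a linear value function, with illustrative names (`phi`, `w`, `alpha`) that are not from the paper:

```python
import numpy as np

# Sketch: semi-gradient TD(0) vs. residual-gradient updates for a
# linear value function V(s) = w . phi(s). Illustrative only.

def semi_gradient_td(w, phi_s, phi_next, r, gamma, alpha):
    # Semi-gradient TD(0): the bootstrap target r + gamma*V(s') is
    # treated as a constant, so the gradient flows only through V(s).
    delta = r + gamma * np.dot(w, phi_next) - np.dot(w, phi_s)
    return w + alpha * delta * phi_s

def residual_gradient(w, phi_s, phi_next, r, gamma, alpha):
    # Residual gradient: descend the squared Bellman error directly,
    # so the gradient also flows through the target term V(s').
    delta = r + gamma * np.dot(w, phi_next) - np.dot(w, phi_s)
    return w - alpha * delta * (gamma * phi_next - phi_s)

w = np.zeros(2)
phi_s, phi_next = np.array([1.0, 0.0]), np.array([0.0, 1.0])
w_semi = semi_gradient_td(w, phi_s, phi_next, r=1.0, gamma=0.9, alpha=0.1)
w_resid = residual_gradient(w, phi_s, phi_next, r=1.0, gamma=0.9, alpha=0.1)
```

Note how the residual update also adjusts the weight associated with the successor state's features, which is what makes it a true gradient method on the Bellman error (at the cost of slower convergence in practice, which the paper's bidirectional target network technique addresses).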

