Structured Control Nets for Deep Reinforcement Learning

02/22/2018
by Mario Srouji, et al.

In recent years, Deep Reinforcement Learning has made impressive advances in solving several important benchmark problems for sequential decision making. Many control applications use a generic multilayer perceptron (MLP) for the non-vision parts of the policy network. In this work, we propose a new neural network architecture for the policy network representation that is simple yet effective. The proposed Structured Control Net (SCN) splits the generic MLP into two separate sub-modules: a nonlinear control module and a linear control module. Intuitively, the nonlinear module provides forward-looking, global control, while the linear module stabilizes the local dynamics around the residual of the global control. We hypothesize that this combines the benefits of linear and nonlinear policies: it improves training sample efficiency, final episodic reward, and generalization of the learned policy, while requiring a smaller network and remaining applicable to different training methods. We validate this hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom 2D urban driving environment, with various ablation and generalization tests, using multiple black-box and policy gradient training methods. The proposed architecture has the potential to improve performance on broader control tasks by incorporating problem-specific priors into the architecture. As a case study, we demonstrate much improved performance on locomotion tasks by emulating biological central pattern generators (CPGs) as the nonlinear part of the architecture.
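The split into a linear and a nonlinear module can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the two module outputs are combined by simple summation, that the linear module has the form K s + b, and that the nonlinear module is a small one-hidden-layer tanh MLP. The class name `StructuredControlNet`, the helper functions, the layer sizes, and the weight initialization are all hypothetical choices made for illustration, and only the Python standard library is used.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class StructuredControlNet:
    """Hypothetical SCN-style policy: action = linear(obs) + nonlinear(obs)."""

    def __init__(self, obs_dim, act_dim, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        # Linear control module: u_l = K s + b
        self.K = make_matrix(act_dim, obs_dim, rng)
        self.b = [0.0] * act_dim
        # Nonlinear control module: small tanh MLP without biases (assumed)
        self.W1 = make_matrix(hidden_dim, obs_dim, rng)
        self.W2 = make_matrix(act_dim, hidden_dim, rng)

    def linear(self, obs):
        return [k + b for k, b in zip(matvec(self.K, obs), self.b)]

    def nonlinear(self, obs):
        hidden = [math.tanh(h) for h in matvec(self.W1, obs)]
        return matvec(self.W2, hidden)

    def act(self, obs):
        # Assumption: the two module outputs are combined by summation.
        u_l = self.linear(obs)
        u_n = self.nonlinear(obs)
        return [l + n for l, n in zip(u_l, u_n)]


policy = StructuredControlNet(obs_dim=4, act_dim=2)
obs = [0.1, -0.2, 0.05, 0.0]
action = policy.act(obs)
print(len(action))
```

Because the two modules are kept separate in this sketch, either one can be swapped out independently, for example replacing `nonlinear` with a task-specific prior, which is the kind of substitution the CPG case study describes.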

research
01/06/2019

Recurrent Control Nets for Deep Reinforcement Learning

Central Pattern Generators (CPGs) are biological neural circuits capable...
research
10/06/2021

Improving Generalization of Deep Reinforcement Learning-based TSP Solvers

Recent work applying deep reinforcement learning (DRL) to solve travelin...
research
07/05/2021

Hybrid and dynamic policy gradient optimization for bipedal robot locomotion

Controlling a non-statically bipedal robot is challenging due to the com...
research
11/10/2017

Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation

Supervised approaches for text summarisation suffer from the problem of ...
research
11/25/2019

A Deep Reinforcement Learning Architecture for Multi-stage Optimal Control

Deep reinforcement learning for high dimensional, hierarchical control t...
research
09/23/2019

Constrained Attractor Selection Using Deep Reinforcement Learning

This paper describes an approach for attractor selection in nonlinear dy...
research
09/09/2021

Learning Vision-Guided Dynamic Locomotion Over Challenging Terrains

Legged robots are becoming increasingly powerful and popular in recent y...
