Recomposing the Reinforcement Learning Building Blocks with Hypernetworks

06/12/2021
by   Shai Keynan, et al.

The Reinforcement Learning (RL) building blocks, i.e., Q-functions and policy networks, usually take elements from the Cartesian product of two domains as input. In particular, the input of the Q-function is both the state and the action, and in multi-task problems (Meta-RL) the policy can take a state and a context. Standard architectures tend to ignore these variables' underlying interpretations and simply concatenate their features into a single vector. In this work, we argue that this choice may lead to poor gradient estimation in actor-critic algorithms and high-variance learning steps in Meta-RL algorithms. To account for the interaction between the input variables, we suggest using a Hypernetwork architecture where a primary network determines the weights of a conditional dynamic network. We show that this approach improves the gradient approximation and reduces the learning step variance, which both accelerates learning and improves the final performance. We demonstrate a consistent improvement across different locomotion tasks and different algorithms, both in RL (TD3 and SAC) and in Meta-RL (MAML and PEARL).
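
As a rough illustration of the Hypernetwork idea described in the abstract, the sketch below builds a toy Q-function in which a primary network maps the state to the weights of a small dynamic network, and that dynamic network is then applied to the action. This is a minimal sketch under stated assumptions, not the authors' implementation: the assignment of the state to the primary network and the action to the dynamic network, the layer sizes, the activations, and all names (`init_primary`, `q_value`, `linear`) are hypothetical choices made for the example.

```python
import math
import random


def linear(x, W, b):
    """Return x @ W + b for a vector x, a row-major matrix W, and a bias vector b."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]


def init_primary(state_dim, action_dim, hidden=16, dyn_hidden=8, seed=0):
    """Create parameters of the primary network (hypothetical sizes and init)."""
    rng = random.Random(seed)
    # The dynamic network has one hidden layer: action -> dyn_hidden -> scalar.
    # Its weight count is what the primary network must output.
    n_dyn = action_dim * dyn_hidden + dyn_hidden + dyn_hidden + 1

    def rand(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": rand(state_dim, hidden), "b1": [0.0] * hidden,
        "W2": rand(hidden, n_dyn), "b2": [0.0] * n_dyn,
        "action_dim": action_dim, "dyn_hidden": dyn_hidden,
    }


def q_value(p, state, action):
    """Evaluate Q(state, action) with state-generated weights."""
    a_dim, d = p["action_dim"], p["dyn_hidden"]

    # Primary network: the state produces a flat vector of dynamic weights.
    h = [math.tanh(v) for v in linear(state, p["W1"], p["b1"])]
    theta = linear(h, p["W2"], p["b2"])

    # Unpack the flat vector into the dynamic network's layers.
    k = a_dim * d
    Wa = [theta[i * d:(i + 1) * d] for i in range(a_dim)]  # action_dim x d
    ba = theta[k:k + d]
    wo = theta[k + d:k + 2 * d]
    bo = theta[k + 2 * d]

    # Dynamic network: the action is processed by the generated weights.
    z = [max(v, 0.0) for v in linear(action, Wa, ba)]
    return sum(zi * wi for zi, wi in zip(z, wo)) + bo


params = init_primary(state_dim=3, action_dim=2)
q = q_value(params, state=[1.0, 0.5, -0.5], action=[0.1, -0.2])
```

In contrast to concatenating the state and action into one input vector, the action here only ever interacts with weights that were themselves computed from the state, so the two inputs combine multiplicatively rather than additively in the first layer.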

research
10/03/2018

Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem

Designing optimal controllers continues to be challenging as systems are...
research
06/16/2021

Offline RL Without Off-Policy Evaluation

Most prior approaches to offline reinforcement learning (RL) have taken ...
research
07/23/2018

Learning to Play Pong using Policy Gradient Learning

Activities in reinforcement learning (RL) revolve around learning the Ma...
research
08/19/2021

Prior Is All You Need to Improve the Robustness and Safety for the First Time Deployment of Meta RL

The field of Meta Reinforcement Learning (Meta-RL) has seen substantial ...
research
10/30/2021

Context Meta-Reinforcement Learning via Neuromodulation

Meta-reinforcement learning (meta-RL) algorithms enable agents to adapt ...
research
02/25/2021

CPG-ACTOR: Reinforcement Learning for Central Pattern Generators

Central Pattern Generators (CPGs) have several properties desirable for ...
research
06/02/2021

Towards Deeper Deep Reinforcement Learning

In computer vision and natural language processing, innovations in model...
