Efficiently Learning Small Policies for Locomotion and Manipulation

09/30/2022
by   Shashank Hegde, et al.

Neural control of memory-constrained, agile robots requires small yet highly performant models. We leverage graph hyper networks to learn graph hyper policies trained with off-policy reinforcement learning. The resulting networks are two orders of magnitude smaller than commonly used networks, yet they encode policies comparable to those encoded by much larger networks trained on the same task. Results across locomotion and manipulation tasks show that our method can be appended to any off-policy reinforcement learning algorithm without any change in hyperparameters. Further, we obtain an array of working policies with differing numbers of parameters, allowing us to pick an optimal network for the memory constraints of a system. Training multiple policies with our method is as sample efficient as training a single policy. Finally, we provide a method to select the best architecture given a constraint on the number of parameters. Project website: https://sites.google.com/usc.edu/graphhyperpolicy
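
To make the idea of picking an architecture under a parameter budget concrete, here is a minimal sketch. It is not the selection method from the paper, which the abstract does not describe. It assumes plain fully connected policies, and the names `mlp_param_count`, `candidate_architectures`, `select_architecture`, and the caller-supplied `score_fn` (for example, an estimated return per architecture) are all hypothetical.

```python
from itertools import product


def mlp_param_count(obs_dim, act_dim, hidden):
    """Count weights and biases of a fully connected policy network."""
    sizes = [obs_dim, *hidden, act_dim]
    # Each layer contributes (inputs * outputs) weights plus one bias per output.
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))


def candidate_architectures(widths, max_layers):
    """Yield every tuple of hidden-layer widths with 1..max_layers layers."""
    for depth in range(1, max_layers + 1):
        yield from product(widths, repeat=depth)


def select_architecture(obs_dim, act_dim, budget, score_fn,
                        widths=(4, 8, 16, 32), max_layers=3):
    """Return the highest-scoring architecture that fits the parameter budget.

    score_fn is an assumption of this sketch: any callable mapping a tuple of
    hidden widths to a number, such as an estimated return.
    Returns None when no candidate fits the budget.
    """
    feasible = [
        hidden
        for hidden in candidate_architectures(widths, max_layers)
        if mlp_param_count(obs_dim, act_dim, hidden) <= budget
    ]
    if not feasible:
        return None
    return max(feasible, key=score_fn)
```

In practice the score would come from evaluating each candidate policy. The sketch only shows the filter-then-rank step, so a system with a tighter memory limit simply passes a smaller `budget`.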


research
03/02/2023

Co-learning Planning and Control Policies Using Differentiable Formal Task Constraints

This paper presents a hierarchical reinforcement learning algorithm cons...
research
01/11/2022

Combining Learning-based Locomotion Policy with Model-based Manipulation for Legged Mobile Manipulators

Deep reinforcement learning produces robust locomotion policies for legg...
research
09/09/2015

Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the cont...
research
10/18/2022

Deep Whole-Body Control: Learning a Unified Policy for Manipulation and Locomotion

An attached arm can significantly increase the applicability of legged r...
research
05/28/2023

On the Value of Myopic Behavior in Policy Reuse

Leveraging learned strategies in unfamiliar scenarios is fundamental to ...
research
05/30/2023

Generating Behaviorally Diverse Policies with Latent Diffusion Models

Recent progress in Quality Diversity Reinforcement Learning (QD-RL) has ...
research
10/19/2020

D2RL: Deep Dense Architectures in Reinforcement Learning

While improvements in deep learning architectures have played a crucial ...
