Iterative Reinforcement Learning Based Design of Dynamic Locomotion Skills for Cassie

03/22/2019
by Zhaoming Xie, et al.

Deep reinforcement learning (DRL) is a promising approach for developing legged locomotion skills. However, the iterative design process that is inevitable in practice is poorly supported by the default methodology: it is difficult to predict the outcomes of changes made to the reward functions, policy architectures, and the set of tasks being trained on. In this paper, we propose a practical method that allows the reward function to be fully redefined on each successive design iteration while limiting the deviation from the previous iteration. We characterize policies via sets of Deterministic Action Stochastic State (DASS) tuples, which pair states visited by the trained stochastic policy with the actions that the deterministic policy takes in those states. New policies are trained using a policy gradient algorithm that mixes RL-based policy gradients with gradient updates defined by the DASS tuples. The tuples also allow for robust policy distillation to new network architectures. We demonstrate the effectiveness of this iterative-design approach on the bipedal robot Cassie, achieving stable walking with different gait styles at various speeds. We also show that policies learned in simulation transfer successfully to the physical robot without any dynamics randomization, and that variable-speed walking policies for the physical robot can be represented by a small dataset of 5-10k tuples.
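
The DASS idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear policy, the toy dynamics, the squared-error imitation loss, the `dass_weight` mixing coefficient, and every function name below are assumptions made for illustration, since the abstract does not specify them. The sketch records states reached under a noisy (stochastic) policy together with the noise-free (deterministic) action for each state, then mixes a placeholder RL gradient with the supervised gradient defined by those tuples.

```python
import numpy as np

STATE_DIM, ACTION_DIM = 4, 2  # toy sizes chosen for illustration, not Cassie's


def collect_dass(W, n_steps, rng, noise_std=0.1):
    """Roll out a noisy linear policy; store (state, deterministic action) pairs."""
    B = 0.5 * np.ones((STATE_DIM, ACTION_DIM))  # made-up toy dynamics matrix
    s = rng.normal(size=STATE_DIM)
    states, actions = [], []
    for _ in range(n_steps):
        mean = W @ s                     # deterministic action for this state
        states.append(s)
        actions.append(mean)             # the tuple stores the noise-free action
        a = mean + noise_std * rng.normal(size=ACTION_DIM)  # action actually executed
        s = 0.5 * s + 0.5 * np.tanh(B @ a) + 0.05 * rng.normal(size=STATE_DIM)
    return np.array(states), np.array(actions)


def dass_loss_and_grad(W, S, A):
    """Mean squared imitation error on the DASS tuples and its gradient w.r.t. W."""
    err = S @ W.T - A
    loss = 0.5 * np.mean(np.sum(err ** 2, axis=1))
    grad = err.T @ S / len(S)
    return loss, grad


def mixed_step(W, rl_grad, S, A, lr=0.1, dass_weight=1.0):
    """One descent step on (RL loss gradient + weighted DASS imitation gradient)."""
    _, g_dass = dass_loss_and_grad(W, S, A)
    return W - lr * (rl_grad + dass_weight * g_dass)


rng = np.random.default_rng(0)
W_old = 0.3 * rng.normal(size=(ACTION_DIM, STATE_DIM))  # stands in for a trained policy
S, A = collect_dass(W_old, 500, rng)

# Pure distillation: the RL gradient is set to zero, so only the tuples drive training.
W_new = np.zeros_like(W_old)
loss_before, _ = dass_loss_and_grad(W_new, S, A)
for _ in range(200):
    W_new = mixed_step(W_new, np.zeros_like(W_new), S, A)
loss_after, _ = dass_loss_and_grad(W_new, S, A)
```

With a nonzero `rl_grad` computed from a redefined reward, the same `mixed_step` would pull the new policy toward the new objective while the DASS term limits how far it drifts from the previous iteration; how the two terms are actually weighted in the paper is not stated in the abstract.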


research
05/13/2019

Learning Novel Policies For Tasks

In this work, we present a reinforcement learning algorithm that can fin...
research
07/16/2018

Bipedal Walking Robot using Deep Deterministic Policy Gradient

Machine learning algorithms have found several applications in the field...
research
12/13/2021

Teaching a Robot to Walk Using Reinforcement Learning

Classical control techniques such as PID and LQR have been used effectiv...
research
07/05/2021

Hybrid and dynamic policy gradient optimization for bipedal robot locomotion

Controlling a non-statically bipedal robot is challenging due to the com...
research
03/29/2021

Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot

In this paper, a hierarchical and robust framework for learning bipedal ...
research
01/19/2021

Meta-Reinforcement Learning for Adaptive Motor Control in Changing Robot Dynamics and Environments

This work developed a meta-learning approach that adapts the control pol...
research
10/22/2019

Learning Humanoid Robot Running Skills through Proximal Policy Optimization

In the current level of evolution of Soccer 3D, motion control is a key ...
