Sim-to-Real Transfer for Biped Locomotion

03/04/2019
by Wenhao Yu, et al.

We present a new approach for transferring dynamic robot control policies, such as biped locomotion, from simulation to real hardware. Key to our approach is performing system identification of the hardware's model parameters μ (e.g. friction, center of mass) in two distinct stages: before policy learning (pre-sysID) and after policy learning (post-sysID). Pre-sysID begins by collecting trajectories from the physical hardware based on a set of generic motion sequences. Because these trajectories may not be related to the task of interest, pre-sysID does not attempt to identify the true value of μ accurately, but only to approximate the range of μ to guide policy learning. Next, a Projected Universal Policy (PUP) is created by simultaneously training a network that projects μ to a low-dimensional latent variable η and a family of policies conditioned on η. The second round of system identification (post-sysID) is then carried out by deploying the PUP on the robot hardware using task-relevant trajectories. We use Bayesian Optimization to determine the value of η that optimizes the performance of the PUP on the real hardware. We have used this approach to create three successful biped locomotion controllers (walk forward, walk backward, walk sideways) on the Darwin OP2 robot.
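
The post-sysID step described above, a Bayesian Optimization search over the latent variable η, can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian-process surrogate, the upper-confidence-bound acquisition rule, the search box [-1, 1], the two-dimensional η, and every name in the code (including `toy_rollout_return`, which stands in for a real hardware rollout of the PUP) are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    # Prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.einsum("ij,ji->i", k_qo, np.linalg.solve(k_oo, k_qo.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bayes_opt_eta(evaluate, eta_dim=2, n_init=4, n_iter=20, beta=2.0, seed=0):
    """Search for the latent eta that maximizes evaluate(eta).

    evaluate is assumed to run the eta-conditioned policy once and
    return a scalar performance measure (higher is better).
    """
    rng = np.random.default_rng(seed)
    etas = rng.uniform(-1.0, 1.0, size=(n_init, eta_dim))
    returns = np.array([evaluate(e) for e in etas])
    for _ in range(n_iter):
        # Standardize returns so the zero-mean, unit-variance prior fits.
        y = (returns - returns.mean()) / (returns.std() + 1e-8)
        candidates = rng.uniform(-1.0, 1.0, size=(512, eta_dim))
        mean, std = gp_posterior(etas, y, candidates)
        # Upper confidence bound: favor high predicted return or high uncertainty.
        next_eta = candidates[np.argmax(mean + beta * std)]
        etas = np.vstack([etas, next_eta])
        returns = np.append(returns, evaluate(next_eta))
    best = np.argmax(returns)
    return etas[best], returns[best]

def toy_rollout_return(eta):
    """Hypothetical stand-in for a hardware rollout: best at a hidden eta."""
    hidden_optimum = np.array([0.4, -0.2])
    return -np.sum((eta - hidden_optimum) ** 2)

best_eta, best_return = bayes_opt_eta(toy_rollout_return)
print("best eta:", best_eta, "return:", best_return)
```

Because η is low-dimensional, a search of this kind needs only a small number of evaluations (24 in this sketch), which matters when each evaluation is a trial on the physical robot.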


research
05/07/2018

Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots

Learning for control can acquire controllers for novel robotic tasks, pa...
research
04/17/2020

Goal-conditioned Batch Reinforcement Learning for Rotation Invariant Locomotion

We propose a novel approach to learn goal-conditioned policies for locom...
research
04/08/2022

Sim-to-Real Learning of Robust Compliant Bipedal Locomotion on Torque Sensor-Less Gear-Driven Humanoid

In deep reinforcement learning, sim-to-real is the mainstream method as ...
research
04/12/2018

Efficient Model Identification for Tensegrity Locomotion

This paper aims to identify in a practical manner unknown physical param...
research
12/06/2022

Walk These Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior

Learned locomotion policies can rapidly adapt to diverse environments si...
research
03/10/2019

Affordance Learning for End-to-End Visuomotor Robot Control

Training end-to-end deep robot policies requires a lot of domain-, task-...
research
10/30/2020

Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach

In this paper, with a view toward fast deployment of locomotion gaits in...
