Learning Setup Policies: Reliable Transition Between Locomotion Behaviours

by Brendan Tidd, et al.

Dynamic platforms that operate over many unique terrain conditions typically require multiple controllers. To transition safely between controllers, there must be an overlap of states between adjacent controllers. We develop a novel method for training Setup Policies that bridge the trajectories between pre-trained Deep Reinforcement Learning (DRL) policies. We demonstrate our method with a simulated biped traversing a difficult jump terrain, where a single policy fails to learn the task, and switching between pre-trained policies without Setup Policies also fails. We perform an ablation of key components of our system, and show that our method outperforms others that learn transition policies. We demonstrate our method with several difficult and diverse terrain types, and show that we can use Setup Policies as part of a modular control suite to successfully traverse a sequence of complex terrains. We show that using Setup Policies improves the success rate for traversing a single difficult jump terrain (from 1.5 a sequence of various terrains (from 6.5





Learning When to Switch: Composing Controllers to Traverse a Sequence of Terrain Artifacts

Legged robots often use separate control policies that are highly engine...

Training Transition Policies via Distribution Matching for Complex Tasks

Humans decompose novel complex tasks into simpler ones to exploit previo...

Direct Random Search for Fine Tuning of Deep Reinforcement Learning Policies

Researchers have demonstrated that Deep Reinforcement Learning (DRL) is ...

Near-optimal Deep Reinforcement Learning Policies from Data for Zone Temperature Control

Replacing poorly performing existing controllers with smarter solutions ...

Multi-Agent Deep Reinforcement Learning for Request Dispatching in Distributed-Controller Software-Defined Networking

Recently, distributed controller architectures have been quickly gaining...

DeepGait: Planning and Control of Quadrupedal Gaits using Deep Reinforcement Learning

This paper addresses the problem of legged locomotion in non-flat terrai...

Learning Stabilizing Control Policies for a Tensegrity Hopper with Augmented Random Search

In this paper, we consider tensegrity hopper - a novel tensegrity-based ...