Generative Adversarial Policy Networks for Behavioural Repertoire

11/07/2018
by Marija Jegorova, et al.

Learning algorithms are enabling robots to solve increasingly challenging real-world tasks. These approaches often rely on demonstrations and reproduce the behavior shown. Unexpected changes in the environment may require using different behaviors to achieve the same effect, for instance to reach and grasp an object in changing clutter. An emerging paradigm addressing this robustness issue is to learn a diverse set of successful behaviors for a given task, from which a robot can select the most suitable policy when faced with a new environment. In this paper, we explore a novel realization of this vision by learning a generative model over policies. Rather than learning a single policy, or a small fixed repertoire, our generative model compactly encodes an unbounded number of policies and allows novel controller variants to be sampled. Leveraging our generative policy network, a robot can sample novel behaviors until it finds one that works for a new environment. We demonstrate this idea with an application of robust ball-throwing in the presence of obstacles. We show that this approach achieves a greater diversity of behaviors than an existing evolutionary approach, while maintaining good efficacy of sampled behaviors, allowing a Baxter robot to hit targets more often when throwing a ball in the presence of obstacles.
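The sample-until-success loop described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: `sample_policy` is a hypothetical stand-in for a trained generator (here a fixed squashing of Gaussian noise into a launch angle and speed), `hits_target` is a toy 2D projectile check rather than a Baxter rollout, and every name and numeric range is an assumption made for illustration.

```python
import math
import random


def sample_policy(rng, latent_dim=2):
    """Hypothetical stand-in for a trained generator: latent noise -> throw parameters."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    # Assumed decoding: launch angle in (10, 80) degrees, speed in (2, 8) m/s.
    angle = math.radians(45.0 + 35.0 * math.tanh(z[0]))
    speed = 5.0 + 3.0 * math.tanh(z[1])
    return angle, speed


def hits_target(policy, target_x, wall_x, wall_height, tol=0.1, g=9.81):
    """Toy check: the ball must clear a wall and land within tol of the target."""
    angle, speed = policy
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    t_wall = wall_x / vx  # time at which the ball is above the wall
    height_at_wall = vy * t_wall - 0.5 * g * t_wall ** 2
    landing_x = vx * (2.0 * vy / g)  # range of a projectile launched from the ground
    return height_at_wall > wall_height and abs(landing_x - target_x) <= tol


def find_working_policy(target_x, wall_x, wall_height, max_samples=10000, seed=0):
    """Draw policies from the generator until one succeeds in this environment."""
    rng = random.Random(seed)
    for attempt in range(1, max_samples + 1):
        policy = sample_policy(rng)
        if hits_target(policy, target_x, wall_x, wall_height):
            return policy, attempt
    return None, max_samples  # no sampled policy worked within the budget


if __name__ == "__main__":
    policy, attempts = find_working_policy(target_x=2.0, wall_x=1.0, wall_height=0.5)
    print("policy:", policy, "found after", attempts, "samples")
```

On a real robot, each call to `hits_target` would be a physical or simulated trial, so the number of samples needed before success is the practical cost of this strategy.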

