Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play

11/30/2018
by Reza Mahjourian, et al.
Training robots with physical bodies requires developing new methods and action representations that allow the learning agents to explore the space of policies efficiently. This work studies sample-efficient learning of complex policies in the context of robot table tennis. It incorporates learning into a hierarchical control framework using a model-free strategy layer (which requires complex reasoning about opponents that is difficult to do in a model-based way), model-based prediction of external objects (which are difficult to control directly with analytic control methods, but are governed by learnable and relatively simple laws of physics), and analytic controllers for the robot itself. Human demonstrations are used to train dynamics models, which, together with the analytic controller, allow any robot that is physically capable to play table tennis without training episodes. Self-play is used to train cooperative and adversarial strategies on top of model-based striking skills trained from human demonstrations. After only about 24,000 strikes in self-play, the agent learns to best exploit the human dynamics models for longer cooperative games. Further experiments demonstrate that model-free variants of the policy can discover new strikes not demonstrated by humans and achieve higher performance at the expense of lower sample-efficiency. Experiments are carried out in a virtual reality environment using sensory observations that are obtainable in the real world. The high sample-efficiency demonstrated in the evaluations shows that the proposed method is suitable for learning directly on physical robots without transfer of models or policies from simulation.
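
The layering described in the abstract can be illustrated with a toy two-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `BallState`, `predict_ball`, `choose_target`, and `plan_return` are hypothetical, the learned dynamics models are replaced by drag-free ballistic flight, the model-free strategy is replaced by a fixed landing target, and the analytic robot controller is omitted (the last step only computes the outgoing ball velocity such a controller would be asked to produce).

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class BallState:
    x: float   # position along the table, m
    z: float   # height above the table surface, m
    vx: float  # horizontal velocity, m/s
    vz: float  # vertical velocity, m/s


def predict_ball(ball: BallState, strike_x: float) -> tuple[float, BallState]:
    """Prediction layer (stand-in): drag-free ballistic flight to the plane x = strike_x.

    The paper trains dynamics models from human demonstrations; this
    closed-form model is only an assumed placeholder.
    """
    t = (strike_x - ball.x) / ball.vx
    z = ball.z + ball.vz * t - 0.5 * G * t * t
    return t, BallState(strike_x, z, ball.vx, ball.vz - G * t)


def choose_target(contact: BallState) -> float:
    """Strategy layer (stand-in): return a landing x-coordinate on the far side.

    The paper learns this layer model-free through self-play; a constant
    is used here so the sketch stays deterministic.
    """
    return 2.0  # hypothetical target, m


def plan_return(contact: BallState, target_x: float, flight_time: float) -> tuple[float, float]:
    """Outgoing ball velocity that lands at (target_x, z = 0) after flight_time seconds."""
    vx = (target_x - contact.x) / flight_time
    # Solve contact.z + vz * T - 0.5 * G * T^2 = 0 for vz.
    vz = (0.5 * G * flight_time**2 - contact.z) / flight_time
    return vx, vz


# Incoming ball 1 m from the robot, travelling toward it.
incoming = BallState(x=1.0, z=0.3, vx=-4.0, vz=1.0)
time_to_contact, contact = predict_ball(incoming, strike_x=0.0)
target_x = choose_target(contact)
out_vx, out_vz = plan_return(contact, target_x, flight_time=0.5)
```

The point of the split is that each function can be replaced independently: a learned predictor for `predict_ball`, a learned policy for `choose_target`, and a real controller consuming the output of `plan_return`.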

research
03/01/2022

Affordance Learning from Play for Sample-Efficient Policy Learning

Robots operating in human-centered environments should have the ability ...
research
06/11/2020

Learning to Play by Imitating Humans

Acquiring multiple skills has commonly involved collecting a large numbe...
research
09/16/2023

Stylized Table Tennis Robots Skill Learning with Incomplete Human Demonstrations

In recent years, Reinforcement Learning (RL) is becoming a popular techn...
research
03/31/2020

Robotic Table Tennis with Model-Free Reinforcement Learning

We propose a model-free algorithm for learning efficient policies capabl...
research
04/24/2023

Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning

Learning algorithms, like Quality-Diversity (QD), can be used to acquire...
research
08/27/2019

A Data-Efficient Deep Learning Approach for Deployable Multimodal Social Robots

The deep supervised and reinforcement learning paradigms (among others) ...
research
02/28/2020

Introducing a Human-like Planner for Reaching in Cluttered Environments

Humans, in comparison to robots, are remarkably adept at reaching for ob...
