Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup

02/13/2019
by Devin Schwab, et al.

We present a method for fast training of vision-based control policies on real robots. The key idea behind our method is to perform multi-task Reinforcement Learning with auxiliary tasks that differ not only in the reward to be optimized but also in the state space in which they operate. In particular, we allow auxiliary task policies to use task features that are available only at training time. This allows for fast learning of auxiliary policies, which subsequently generate good data for training the main, vision-based control policies. This method can be seen as an extension of the Scheduled Auxiliary Control (SAC-X) framework. We demonstrate the efficacy of our method on both a simulated and a real-world Ball-in-a-Cup game controlled by a robot arm. In simulation, our approach leads to significant learning speed-ups when compared to standard SAC-X. On the real robot we show that the task can be learned from scratch, i.e., with no transfer from simulation and no imitation learning. Videos of our learned policies running on the real robot can be found at https://sites.google.com/view/rss-2019-sawyer-bic/.
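
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the task names, the toy environment, the random policy, the uniform task choice, and the reward thresholds are all hypothetical placeholders, and no learning update is shown. The sketch only shows how tasks that see different parts of the observation (features versus pixels) could write to one shared buffer, so that data gathered while executing a feature-based auxiliary task can later be read back in the vision-based task's own state space.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Full observation recorded at training time (hypothetical layout).
Observation = Dict[str, List[float]]


@dataclass
class Task:
    name: str
    obs_keys: Tuple[str, ...]  # parts of the observation this task's policy may see
    reward_fn: Callable[[Observation], float]


def filter_obs(obs: Observation, task: Task) -> Observation:
    """Restrict a full observation to the state space of one task."""
    return {key: obs[key] for key in task.obs_keys}


def toy_env_step(action) -> Observation:
    """Hypothetical stand-in for the robot: returns features and fake pixels."""
    height = random.random()
    return {"features": [height], "pixels": [height] * 4}


def random_policy(task: Task, obs: Observation) -> float:
    """Placeholder policy; a real agent would have one learned policy per task."""
    return random.uniform(-1.0, 1.0)


def collect_episode(tasks, env_step, policy, buffer, steps=5):
    """Run one task (chosen uniformly here, as an assumption) and store
    transitions with the full observation and the rewards of all tasks."""
    task = random.choice(tasks)
    obs = env_step(None)
    for _ in range(steps):
        action = policy(task, filter_obs(obs, task))
        next_obs = env_step(action)
        rewards = {t.name: t.reward_fn(next_obs) for t in tasks}
        buffer.append((obs, action, rewards, next_obs))
        obs = next_obs
    return task


def batch_for(task: Task, buffer):
    """View the shared buffer in one task's state space with that task's reward."""
    return [
        (filter_obs(o, task), a, r[task.name], filter_obs(n, task))
        for o, a, r, n in buffer
    ]


# Hypothetical tasks: rewards are computed from features in both cases,
# but only the auxiliary task's policy is allowed to observe them.
feature_task = Task("aux_ball_high_features", ("features",),
                    lambda o: float(o["features"][0] > 0.5))
vision_task = Task("main_catch_pixels", ("pixels",),
                   lambda o: float(o["features"][0] > 0.9))

buffer: list = []
collect_episode([feature_task, vision_task], toy_env_step, random_policy, buffer)
vision_batch = batch_for(vision_task, buffer)
```

Every transition in `vision_batch` contains only the `pixels` entry, regardless of which task generated it, which is the property that lets a quickly learned feature-based policy supply training data for the vision-based one.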

research
05/20/2021

Towards a Sample Efficient Reinforcement Learning Pipeline for Vision Based Robotics

Deep Reinforcement learning holds the guarantee of empowering self-rulin...
research
07/02/2013

Multi-Task Policy Search

Learning policies that generalize across multiple tasks is an important ...
research
07/27/2018

Adapting control policies from simulation to reality using a pairwise loss

This paper proposes an approach to domain transfer based on a pairwise l...
research
11/24/2021

Ex-DoF: Expansion of Action Degree-of-Freedom with Virtual Camera Rotation for Omnidirectional Image

Inter-robot transfer of training data is a little explored topic in lear...
research
12/28/2020

Disentangled Planning and Control in Vision Based Robotics via Reward Machines

In this work we augment a Deep Q-Learning agent with a Reward Machine (D...
research
07/31/2023

Discovering Adaptable Symbolic Algorithms from Scratch

Autonomous robots deployed in the real world will need control policies ...
research
02/12/2019

Value constrained model-free continuous control

The naive application of Reinforcement Learning algorithms to continuous...
