The Treachery of Images: Bayesian Scene Keypoints for Deep Policy Learning in Robotic Manipulation

05/08/2023
by Jan Ole von Hartz, et al.

In policy learning for robotic manipulation, sample efficiency is of paramount importance. Thus, learning and extracting more compact representations from camera observations is a promising avenue. However, current methods often assume full observability of the scene and struggle with scale invariance. In many tasks and settings, this assumption does not hold as objects in the scene are often occluded or lie outside the field of view of the camera, rendering the camera observation ambiguous with regard to their location. To tackle this problem, we present BASK, a Bayesian approach to tracking scale-invariant keypoints over time. Our approach successfully resolves inherent ambiguities in images, enabling keypoint tracking on symmetrical objects and occluded and out-of-view objects. We employ our method to learn challenging multi-object robot manipulation tasks from wrist camera observations and demonstrate superior utility for policy learning compared to other representation learning techniques. Furthermore, we show outstanding robustness towards disturbances such as clutter, occlusions, and noisy depth measurements, as well as generalization to unseen objects both in simulation and real-world robotic experiments.
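The core idea of maintaining a belief over keypoint locations and refining it with each observation can be illustrated with a recursive Bayes filter. The sketch below is not the paper's implementation; it is a minimal, hypothetical example over a discretized set of candidate scene cells, showing how an ambiguous observation (e.g. from a symmetric object) is resolved once a later viewpoint provides disambiguating evidence.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """One recursive Bayes step: posterior is proportional to likelihood * prior."""
    posterior = prior * likelihood
    total = posterior.sum()
    if total == 0:  # degenerate observation; fall back to the prior
        return prior
    return posterior / total

# Toy scenario: the keypoint lies in one of 5 discretized scene cells.
prior = np.full(5, 0.2)  # uniform belief before any observation

# Frame 1: a symmetric object makes cells 1 and 3 equally likely (ambiguous).
obs1 = np.array([0.05, 0.4, 0.1, 0.4, 0.05])
belief = bayes_update(prior, obs1)

# Frame 2: a new viewpoint favors cell 3, resolving the ambiguity.
obs2 = np.array([0.05, 0.1, 0.1, 0.7, 0.05])
belief = bayes_update(belief, obs2)

print(belief.argmax())  # belief now concentrates on cell 3
```

Because the belief persists across frames, the estimate remains usable even while the object is occluded or out of view; new observations simply sharpen or shift it rather than restarting from scratch.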


Related research

- Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation (05/17/2022)
- Multi-view Fusion for Multi-level Robotic Scene Understanding (03/25/2021)
- Multi-Object Representation Learning with Iterative Variational Inference (03/01/2019)
- Visual-Policy Learning through Multi-Camera View to Single-Camera View Knowledge Distillation for Robot Manipulation Tasks (03/13/2023)
- 3D Move to See: Multi-perspective visual servoing for improving object views with semantic segmentation (09/21/2018)
- Reinforcement Learning of Active Vision for Manipulating Objects under Occlusions (11/20/2018)
- Learning 3D Dynamic Scene Representations for Robot Manipulation (11/03/2020)
