Self-Supervised Correspondence in Visuomotor Policy Learning

by Peter Florence, et al.

In this paper, we explore using self-supervised correspondence to improve the generalization performance and sample efficiency of visuomotor policy learning. Prior work has primarily used approaches such as autoencoding, pose-based losses, and end-to-end policy optimization to train the visual portion of visuomotor policies. We instead propose an approach using self-supervised dense visual correspondence training, and show that this enables visuomotor policy learning with surprisingly high generalization performance from modest amounts of data: using imitation learning, we demonstrate extensive hardware validation on challenging manipulation tasks with as few as 50 demonstrations. Our learned policies can generalize across classes of objects, react to deformable object configurations, and manipulate textureless symmetrical objects in a variety of backgrounds, all with closed-loop, real-time vision-based policies. Simulated imitation learning experiments suggest that correspondence training offers sample complexity and generalization benefits compared to autoencoding and end-to-end training.
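The abstract does not spell out the correspondence training objective, so the following is only an illustrative sketch of what "dense visual correspondence" can mean in practice. It assumes each pixel carries a descriptor vector, that correspondences are found by nearest-neighbor search in descriptor space, and that training uses a pixelwise contrastive loss with a margin. All names here (`find_correspondence`, `pixelwise_contrastive_loss`, `margin`) and the toy data are hypothetical and are not taken from the paper.

```python
import math

def descriptor_distance(d1, d2):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def find_correspondence(desc_a, desc_b, pixel_a):
    """Find the pixel in image B whose descriptor is closest to the
    descriptor at pixel_a in image A. Returns ((row, col), distance)."""
    r, c = pixel_a
    query = desc_a[r][c]
    best_pixel, best_dist = None, float("inf")
    for i, row in enumerate(desc_b):
        for j, d in enumerate(row):
            dist = descriptor_distance(query, d)
            if dist < best_dist:
                best_pixel, best_dist = (i, j), dist
    return best_pixel, best_dist

def pixelwise_contrastive_loss(desc_a, desc_b, matches, non_matches, margin=0.5):
    """Assumed objective: pull matching pixels together and push
    non-matching pixels at least `margin` apart in descriptor space."""
    match_loss = sum(
        descriptor_distance(desc_a[ra][ca], desc_b[rb][cb]) ** 2
        for (ra, ca), (rb, cb) in matches
    ) / max(len(matches), 1)
    non_match_loss = sum(
        max(0.0, margin - descriptor_distance(desc_a[ra][ca], desc_b[rb][cb])) ** 2
        for (ra, ca), (rb, cb) in non_matches
    ) / max(len(non_matches), 1)
    return match_loss + non_match_loss

# Toy 2x2 "images" with 2-D descriptors; image B is image A mirrored left-right.
desc_a = [[[0.0, 0.0], [1.0, 0.0]],
          [[0.0, 1.0], [1.0, 1.0]]]
desc_b = [row[::-1] for row in desc_a]

# Pixel pairs written as ((row_a, col_a), (row_b, col_b)).
matches = [((0, 0), (0, 1)), ((1, 1), (1, 0))]
non_matches = [((0, 0), (0, 0))]

pixel_b, dist = find_correspondence(desc_a, desc_b, (0, 0))
loss = pixelwise_contrastive_loss(desc_a, desc_b, matches, non_matches)
```

In this toy case the descriptors are already perfectly consistent across the two images, so the nearest-neighbor lookup recovers the mirrored pixel and the loss is zero. In a real system the descriptors would come from a learned network, and the match and non-match pairs from a self-supervised source such as known camera geometry.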



Practical Imitation Learning in the Real World via Task Consistency Loss

Recent work in visual end-to-end learning for robotics has shown the pro...

Self-supervised Learning of Dense Shape Correspondence

We introduce the first completely unsupervised correspondence learning a...

Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation

In recent years, policy learning methods using either reinforcement or i...

Generalization Guarantees for Multi-Modal Imitation Learning

Control policies from imitation learning can often fail to generalize to...

Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations

Learned visuomotor policies have shown considerable success as an altern...

The Surprising Effectiveness of Representation Learning for Visual Imitation

While visual imitation learning offers one of the most effective ways of...

Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation

Vision-based robotics often separates the control loop into one module f...

Code Repositories