Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks

07/28/2019
by Michelle A. Lee, et al.

Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is non-trivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots because of their sample complexity. In this work, we use self-supervision to learn a compact, multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot.
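
To make the idea of self-supervision concrete, here is a minimal sketch of how training labels can be derived from the sensor streams themselves rather than from manual annotation. The abstract does not name the objectives used, so the task shown (deciding whether a visual sample and a haptic sample come from the same time step) is an assumption chosen for illustration. The function name `make_alignment_pairs` and its parameters are hypothetical.

```python
import random


def make_alignment_pairs(vision_seq, force_seq, rng, min_shift=1):
    """Build (vision, force, label) triples without manual labels.

    Assumed objective, for illustration only:
    label 1 means both readings come from the same time step,
    label 0 means the force reading comes from a different time step.
    """
    if len(vision_seq) != len(force_seq):
        raise ValueError("sequences must have equal length")
    n = len(vision_seq)
    if min_shift < 1 or n < 2 * min_shift:
        raise ValueError("sequence too short for the requested shift")

    pairs = []
    for t in range(n):
        # Positive example: time-aligned vision and force readings.
        pairs.append((vision_seq[t], force_seq[t], 1))
        # Negative example: force reading from a different time step,
        # chosen cyclically so the index always stays in range.
        shift = rng.randint(min_shift, n - min_shift)
        pairs.append((vision_seq[t], force_seq[(t + shift) % n], 0))
    return pairs


# Toy usage: each "feature" is a one-element list tagged by its time step.
rng = random.Random(0)
vision = [[float(t)] for t in range(5)]
force = [[10.0 * t] for t in range(5)]
pairs = make_alignment_pairs(vision, force, rng)
```

A classifier trained on such pairs has to relate what the camera sees to what the force sensor feels. That is one plausible way an encoder could be pushed toward a shared multimodal representation before any policy learning takes place.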


