Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks

10/24/2018
by Michelle A. Lee, et al.

Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally impractical to deploy on real robots due to their sample complexity. We use self-supervision to learn a compact multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. We evaluate our method on a peg insertion task, generalizing over different geometries, configurations, and clearances, while remaining robust to external perturbations. Results for simulated and real robot experiments are presented.
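To make the idea of a compact multimodal representation concrete, the following is a minimal, illustrative sketch (not the paper's actual architecture): two hypothetical encoders map a camera frame and a window of force/torque readings into per-modality features, which are then fused into one low-dimensional vector that a policy could consume. The encoder names, sizes, and the use of untrained linear layers are all assumptions for illustration; the paper trains deep encoders with self-supervised objectives.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, untrained linear encoders standing in for learned deep
# vision and haptic encoders (all shapes are illustrative choices).
W_vision = rng.standard_normal((128, 32 * 32 * 3)) * 0.01  # RGB frame -> 128-d
W_haptic = rng.standard_normal((128, 32 * 6)) * 0.01       # 32 F/T readings -> 128-d
W_fuse   = rng.standard_normal((64, 256)) * 0.01           # concat(128+128) -> 64-d

def encode(image, wrench_window):
    """Fuse vision and touch into one compact multimodal state vector."""
    z_v = np.tanh(W_vision @ image.ravel())          # vision features
    z_h = np.tanh(W_haptic @ wrench_window.ravel())  # haptic features
    return np.tanh(W_fuse @ np.concatenate([z_v, z_h]))

image = rng.random((32, 32, 3))   # camera observation
wrench = rng.random((32, 6))      # recent force/torque history
z = encode(image, wrench)
print(z.shape)                    # compact input for a downstream policy
```

A policy network would then act on the 64-dimensional `z` rather than raw pixels and force signals, which is what makes reinforcement learning sample-efficient enough for a real robot.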

