Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning

05/30/2022
by Maximilian Du, et al.

Humans are capable of completing a range of challenging manipulation tasks that require reasoning jointly over modalities such as vision, touch, and sound. Moreover, many such tasks are partially observed; for example, taking a notebook out of a backpack causes visual occlusion and requires reasoning over the history of audio or tactile information. While robust tactile sensing can be costly to capture on robots, microphones near or on a robot's gripper are a cheap and easy way to acquire audio feedback of contact events, which can be a surprisingly valuable data source for perception in the absence of vision. Motivated by the potential for sound to mitigate visual occlusion, we aim to learn a set of challenging partially-observed manipulation tasks from visual and audio inputs. Our proposed system learns these tasks by combining offline imitation learning from a modest number of tele-operated demonstrations with online finetuning using human-provided interventions. In a set of simulated tasks, we find that our system benefits from using audio, and that online interventions improve the success rate of offline imitation learning by roughly 20%. Finally, our system completes a set of challenging, partially-observed tasks on a real Franka Emika Panda robot, such as extracting keys from a bag, with a 70% success rate, higher than a comparable policy that does not use audio.
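The abstract describes the approach only at a high level. As a rough illustration, the sketch below shows one plausible way to fuse camera images with gripper-microphone spectrograms in a single policy trained by behavioral cloning on tele-operated demonstrations. This is a minimal, hypothetical example, not the authors' architecture: the class name AudioVisualPolicy, the layer sizes, and the input shapes are all assumptions, and the paper's online finetuning would add human-intervention data on top of a loop like this.

# Minimal illustrative sketch (not the paper's implementation): a policy that
# fuses an RGB image with an audio spectrogram, trained by behavioral cloning.
# Layer sizes and input shapes are assumptions.
import torch
import torch.nn as nn

class AudioVisualPolicy(nn.Module):
    def __init__(self, action_dim=7):
        super().__init__()
        # Vision branch: small CNN over the camera frame.
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, 128), nn.ReLU(),
        )
        # Audio branch: CNN over a spectrogram of recent gripper-microphone audio.
        self.audio = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, 128), nn.ReLU(),
        )
        # Fuse both modalities and predict a continuous robot action.
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, image, spectrogram):
        fused = torch.cat([self.vision(image), self.audio(spectrogram)], dim=-1)
        return self.head(fused)

# One behavioral-cloning step on (image, audio, expert action) tuples; online
# finetuning would append intervention data to the same training set.
policy = AudioVisualPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
image = torch.randn(8, 3, 64, 64)          # dummy batch of camera frames
spectrogram = torch.randn(8, 1, 64, 100)   # dummy batch of audio spectrograms
expert_action = torch.randn(8, 7)          # dummy tele-operated actions
loss = nn.functional.mse_loss(policy(image, spectrogram), expert_action)
optimizer.zero_grad()
loss.backward()
optimizer.step()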



