Multimodal representation models for prediction and control from partial information

10/09/2019
by Martina Zambelli et al.

Similar to humans, robots benefit from interacting with their environment through a number of different sensor modalities, such as vision, touch, and sound. However, learning from multiple sensor modalities is difficult, because the learning model must handle diverse types of signals and learn a coherent representation even when parts of the sensor input are missing. In this paper, a multimodal variational autoencoder is proposed to enable an iCub humanoid robot to learn representations of its sensorimotor capabilities from different sensor modalities. The proposed model is able to (1) reconstruct missing sensory modalities, (2) predict the sensorimotor state of self and the visual trajectories of other agents' actions, and (3) control the agent to imitate an observed visual trajectory. The proposed multimodal variational autoencoder can also capture the kinematic redundancy of the robot's motion through the learned probability distribution. Training multimodal models is not trivial due to the combinatorial complexity introduced by the possibility of missing modalities. We propose a strategy to train multimodal models that achieves improved performance across different reconstruction models. Finally, extensive experiments carried out on an iCub humanoid robot show high performance in multiple reconstruction, prediction, and imitation tasks.
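The abstract does not detail the model architecture, but a common way for a multimodal VAE to remain usable when modalities are missing is to fuse per-modality Gaussian posteriors with a product of experts, and to randomly drop modalities during training so every missing-modality pattern is exercised. The sketch below is illustrative only, not the paper's method: `poe_fuse` and `modality_dropout` are hypothetical helper names, and the Gaussian experts plus standard-normal prior expert are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def poe_fuse(mus, logvars, mask):
    """Product-of-Gaussian-experts fusion over the available modalities.

    mus, logvars: arrays of shape (n_modalities, latent_dim).
    mask: boolean array of shape (n_modalities,), True = modality observed.
    A standard-normal prior expert (mu=0, var=1) is always included, so the
    fused posterior is well defined even when every modality is missing.
    """
    latent_dim = mus.shape[1]
    precisions = [np.ones(latent_dim)]          # prior expert: precision 1
    weighted_mus = [np.zeros(latent_dim)]       # prior expert: mean 0
    for m in range(mus.shape[0]):
        if mask[m]:
            prec = np.exp(-logvars[m])          # precision = 1 / variance
            precisions.append(prec)
            weighted_mus.append(prec * mus[m])
    total_prec = np.sum(precisions, axis=0)
    mu = np.sum(weighted_mus, axis=0) / total_prec
    var = 1.0 / total_prec
    return mu, np.log(var)

def modality_dropout(n_modalities, p_keep=0.7):
    """Training-time mask: drop each modality independently, keep at least one."""
    mask = rng.random(n_modalities) < p_keep
    if not mask.any():
        mask[rng.integers(n_modalities)] = True
    return mask
```

At test time the same fusion is applied with whatever modalities happen to be observed, so reconstruction of a missing modality reduces to decoding from the latent fused over the remaining ones.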


Related research

- Multimodal Deep Generative Models for Trajectory Prediction: A Conditional Variational Autoencoder Approach (08/10/2020)
- LRMM: Learning to Recommend with Missing Modalities (08/21/2018)
- Multimodal dynamics modeling for off-road autonomous vehicles (11/23/2020)
- Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories (04/06/2022)
- Training Multimodal Systems for Classification with Multiple Objectives (08/26/2020)
- Multimodal VAE Active Inference Controller (03/07/2021)
