Learning Flexible Translation between Robot Actions and Language Descriptions

07/15/2022
by   Ozan Özdemir, et al.

Handling various robot action-language translation tasks flexibly is an essential requirement for natural interaction between a robot and a human. Previous approaches require changes to the model architecture configuration for each task during inference, which undermines the premise of multi-task learning. In this work, we propose the paired gated autoencoders (PGAE) for flexible translation between robot actions and language descriptions in a tabletop object manipulation scenario. We train our model in an end-to-end fashion by pairing each action with appropriate descriptions that contain a signal indicating the translation direction. During inference, our model can flexibly translate from action to language and vice versa according to the given language signal. Moreover, with the option to use a pretrained language model as the language encoder, our model has the potential to recognise unseen natural language input. Another capability of our model is that it can recognise and imitate actions of another agent by utilising robot demonstrations. The experimental results highlight the flexible bidirectional translation capabilities of our approach, along with its ability to generalise to the actions of an opposite-sitting agent.
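The core idea of a direction signal selecting between action-to-language and language-to-action translation can be illustrated with a toy sketch. The following is a minimal, hypothetical illustration only: the weight matrices stand in for trained encoder/decoder networks, and all names, dimensions, and the `translate` routing function are assumptions, not the paper's actual PGAE implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature sizes -- illustrative, not taken from the paper.
LANG_DIM, ACT_DIM, HIDDEN = 8, 6, 4

# Random weights standing in for trained encoders/decoders.
W_lang_enc = rng.normal(size=(LANG_DIM, HIDDEN))
W_act_enc = rng.normal(size=(ACT_DIM, HIDDEN))
W_lang_dec = rng.normal(size=(HIDDEN, LANG_DIM))
W_act_dec = rng.normal(size=(HIDDEN, ACT_DIM))

def translate(lang_feat, act_feat, signal):
    """Translate between modalities via a shared latent code.

    `signal` mimics the direction token paired with each input:
      'describe' -> encode the action, decode a description
      'execute'  -> encode the description, decode an action
    """
    if signal == "describe":
        h = np.tanh(act_feat @ W_act_enc)    # action -> shared latent
        return h @ W_lang_dec                # latent -> language features
    if signal == "execute":
        h = np.tanh(lang_feat @ W_lang_enc)  # language -> shared latent
        return h @ W_act_dec                 # latent -> action features
    raise ValueError(f"unknown signal: {signal}")

lang = rng.normal(size=LANG_DIM)
act = rng.normal(size=ACT_DIM)
print(translate(lang, act, "describe").shape)  # language-sized output: (8,)
print(translate(lang, act, "execute").shape)   # action-sized output: (6,)
```

Because both directions pass through the same shared latent space, a single set of weights serves both tasks and only the signal token changes at inference time, which is the flexibility the abstract describes.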


research
01/17/2022

Language Model-Based Paired Variational Autoencoders for Robotic Language Learning

Human infants learn language while interacting with their environment in...
research
03/08/2022

Learning Bidirectional Translation between Descriptions and Actions with Small Paired Data

This study achieved bidirectional translation between descriptions and a...
research
05/27/2019

Harry Potter and the Action Prediction Challenge from Natural Language

We explore the challenge of action prediction from textual descriptions ...
research
01/09/2023

Learning Bidirectional Action-Language Translation with Limited Supervision and Incongruent Input

Human infant learning happens during exploration of the environment, by ...
research
09/02/2023

Developmental Scaffolding with Large Language Models

Exploration and self-observation are key mechanisms of infant sensor...
research
03/23/2020

Caption Generation of Robot Behaviors based on Unsupervised Learning of Action Segments

Bridging robot action sequences and their natural language captions is a...
research
03/13/2022

Summarizing a virtual robot's past actions in natural language

We propose and demonstrate the task of giving natural language summaries...
