Moment-based Adversarial Training for Embodied Language Comprehension

04/02/2022
by   Shintaro Ishikawa, et al.

In this paper, we focus on a vision-and-language task in which a robot is instructed to execute household tasks. Given an instruction such as "Rinse off a mug and place it in the coffee maker," the robot is required to locate the mug, wash it, and put it in the coffee maker. This is challenging because the robot needs to break down the instruction sentences into subgoals and execute them in the correct order. On the ALFRED benchmark, the performance of state-of-the-art methods is still far lower than that of humans, partly because existing methods sometimes fail to infer subgoals that are not explicitly specified in the instruction sentences. We propose Moment-based Adversarial Training (MAT), which uses two types of moments for perturbation updates in adversarial training. We apply MAT to the embedding spaces of the instruction, subgoals, and state representations to handle the variety in each. We validated our method on the ALFRED benchmark, and the results show that it outperformed the baseline method on all of the benchmark's metrics.
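The abstract does not spell out the update rule, so the following is only an illustrative sketch and not the authors' implementation. It assumes that the "two types of moments" are Adam-style running averages of the gradient and of the squared gradient, applied to a perturbation added to an embedding. The function name `moment_perturbation_step`, the toy quadratic loss, the L2 projection, and every hyperparameter value are hypothetical choices made for this example.

```python
import math
import random


def moment_perturbation_step(delta, grad, m, v, t,
                             lr=0.01, beta1=0.9, beta2=0.999,
                             eps=1e-8, radius=0.1):
    """One gradient-ascent step on a perturbation using two moments.

    ASSUMPTION: Adam-style first/second moments; the abstract does not
    confirm this exact rule. All defaults are hypothetical.
    """
    # First moment: running mean of the gradient.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    # Second moment: running mean of the squared gradient.
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    # Bias correction for the zero-initialised moments (t starts at 1).
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Ascent step: move the perturbation in the direction that raises the loss.
    delta = [d + lr * mh / (math.sqrt(vh) + eps)
             for d, mh, vh in zip(delta, m_hat, v_hat)]
    # Project back onto an L2 ball so the perturbation stays small.
    norm = math.sqrt(sum(d * d for d in delta))
    if norm > radius:
        delta = [d * radius / norm for d in delta]
    return delta, m, v


def toy_loss_and_grad(x):
    """Toy stand-in for a task loss: 0.5 * ||x||^2, whose gradient is x."""
    return 0.5 * sum(xi * xi for xi in x), list(x)


rng = random.Random(0)
embedding = [rng.gauss(0.0, 1.0) for _ in range(8)]  # stand-in for an instruction embedding
delta = [0.0] * 8
m = [0.0] * 8
v = [0.0] * 8

clean_loss, _ = toy_loss_and_grad(embedding)
for t in range(1, 6):
    perturbed = [e + d for e, d in zip(embedding, delta)]
    _, grad = toy_loss_and_grad(perturbed)
    delta, m, v = moment_perturbation_step(delta, grad, m, v, t)

adv_loss, _ = toy_loss_and_grad([e + d for e, d in zip(embedding, delta)])
delta_norm = math.sqrt(sum(d * d for d in delta))
```

In a full adversarial-training loop, the model parameters would then be updated to reduce the loss on the perturbed embeddings. Under the abstract's description, a separate perturbation of this kind would be kept for each of the instruction, subgoal, and state embedding spaces.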


