Interactive Grounded Language Acquisition and Generalization in a 2D World

01/31/2018
by Haonan Yu, et al.

We build a virtual agent for learning language in a 2D maze-like world. The agent sees images of the surrounding environment, listens to a virtual teacher, and takes actions to receive rewards. It interactively learns the teacher's language from scratch based on two language use cases: sentence-directed navigation and question answering. It simultaneously learns the visual representations of the world, the language, and the action control. By disentangling language grounding from other computational routines and sharing a concept detection function between language grounding and prediction, the agent reliably interpolates and extrapolates to interpret sentences that contain new word combinations or new words missing from the training sentences. These new words are transferred from the answers of language prediction. This language ability is trained and evaluated on a population of over 1.6 million distinct sentences consisting of 119 object words, 8 color words, 9 spatial-relation words, and 50 grammatical words. The proposed model significantly outperforms five comparison methods for interpreting zero-shot sentences. In addition, we demonstrate human-interpretable intermediate outputs of the model in the appendix.

research
03/28/2017

A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment

We tackle a task where an agent learns to navigate in a 2D maze-like env...
research
04/26/2019

The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision

We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that lear...
research
06/16/2023

Learning to Summarize and Answer Questions about a Virtual Robot's Past Actions

When robots perform long action sequences, users will want to easily and...
research
02/07/2023

Learning Manner of Execution from Partial Corrections

Some actions must be executed in different ways depending on the context...
research
05/22/2018

Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents

Recently there has been a rising interest in training agents, embodied i...
research
05/19/2022

Sentences as connection paths: A neural language architecture of sentence structure in the brain

This article presents a neural language architecture of sentence structu...
