Understanding Grounded Language Learning Agents

10/26/2017
by Felix Hill, et al.

Neural network-based systems can now learn to locate the referents of words and phrases in images, answer questions about visual scenes, and even execute symbolic instructions as first-person actors in partially-observable worlds. To achieve this so-called grounded language learning, models must overcome certain well-studied learning challenges that are also fundamental to infants learning their first words. While it is notable that models with no meaningful prior knowledge overcome these learning obstacles, AI researchers and practitioners currently lack a clear understanding of exactly how they do so. Here we address this question as a way of achieving a clearer general understanding of grounded language learning, both to inform future research and to improve confidence in model predictions. For maximum control and generality, we focus on a simple neural network-based language learning agent trained via policy-gradient methods to interpret synthetic linguistic instructions in a simulated 3D world. We apply experimental paradigms from developmental psychology to this agent, exploring the conditions under which established human biases and learning effects emerge. We further propose a novel way to visualise and analyse semantic representation in grounded language learning agents that yields a plausible computational account of the observed effects.
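The abstract mentions training the agent with policy-gradient methods to follow linguistic instructions. As a loose illustration of that training signal (not the paper's agent or architecture), the sketch below trains a tabular softmax policy with REINFORCE on a toy one-step task: an "instruction" names one of N objects, and the policy is rewarded for selecting the named object. All variable names and the task itself are illustrative assumptions.

```python
import numpy as np

# Toy sketch, assuming a one-step instruction-following task:
# the instruction is an object index, the action is a choice of object,
# and reward is 1 only when the chosen object matches the instruction.
rng = np.random.default_rng(0)
N = 5                      # number of object identities
W = np.zeros((N, N))       # policy weights: instruction -> action scores
lr = 0.5                   # learning rate (illustrative value)

def softmax(z):
    z = z - z.max()        # stabilise the exponential
    e = np.exp(z)
    return e / e.sum()

for step in range(2000):
    instruction = rng.integers(N)          # which object to "pick up"
    probs = softmax(W[instruction])
    action = rng.choice(N, p=probs)        # sample from the policy
    reward = 1.0 if action == instruction else 0.0
    # REINFORCE update: reward * grad of log pi(action | instruction)
    grad = -probs
    grad[action] += 1.0
    W[instruction] += lr * reward * grad

# After training, the greedy policy should follow every instruction.
accuracy = np.mean([softmax(W[i]).argmax() == i for i in range(N)])
print(accuracy)
```

The agent studied in the paper faces a far harder version of this problem (pixels, partial observability, compositional language), but the underlying learning rule, reinforcing action log-probabilities in proportion to reward, is the same family of method.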



