Visually Grounded, Situated Learning in Neural Models

05/29/2018
by Alexander G. Ororbia, et al.

The theory of situated cognition postulates that language is inseparable from its physical context: words, phrases, and sentences must be learned in the context of the objects or concepts to which they refer. Yet statistical language models are trained on words alone, which makes it impossible for them to connect to the real world, that is, the world described in the sentences presented to the model. In this paper, we examine the generalization ability of neural language models trained with a visual context. We propose a multimodal connectionist language architecture based on the Differential State Framework, which outperforms its equivalent trained on language alone, even when no visual context is available at test time. The superior performance of language models trained with a visual context is robust across different languages and models.


