Multimodal Grounding for Language Processing

06/17/2018
by Lisa Beinborn, et al.

This survey discusses how recent developments in multimodal processing facilitate the conceptual grounding of language. We categorize the information flow in multimodal processing with respect to cognitive models of human information processing and analyze different methods for combining multimodal representations. Based on this methodological inventory, we discuss the benefits of multimodal grounding for a variety of language processing tasks and the challenges that arise. We focus in particular on the multimodal grounding of verbs, which play a crucial role in the compositional power of language.
