Do You See What I Mean? Visual Resolution of Linguistic Ambiguities

03/26/2016
by   Yevgeni Berzak, et al.
0

Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, representing a wide range of syntactic, semantic and discourse ambiguities, coupled with videos that visualize the different interpretations for each sentence. We address this task by extending a vision model which determines if a sentence is depicted by a video. We demonstrate how such a model can be adjusted to recognize different interpretations of the same underlying sentence, allowing to disambiguate sentences in a unified fashion across the different ambiguity types.

READ FULL TEXT

page 3

page 6

research
06/18/2020

Explainable and Discourse Topic-aware Neural Language Understanding

Marrying topic models and language models exposes language understanding...
research
01/20/2022

VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation

Existing multimodal machine translation (MMT) datasets consist of images...
research
08/01/2023

Discourse-Aware Text Simplification: From Complex Sentences to Linked Propositions

Sentences that present a complex syntax act as a major stumbling block f...
research
11/10/2019

INSET: Sentence Infilling with Inter-sentential Generative Pre-training

Missing sentence generation (or sentence infilling) fosters a wide range...
research
10/26/2022

Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding

Negation poses a challenge in many natural language understanding tasks....
research
02/07/2020

Incorporating Visual Semantics into Sentence Representations within a Grounded Space

Language grounding is an active field aiming at enriching textual repres...
research
09/10/2019

Multimodal Attention Branch Network for Perspective-Free Sentence Generation

In this paper, we address the automatic sentence generation of fetching ...

Please sign up or login with your details

Forgot password? Click here to reset