Order-Embeddings of Images and Language

11/19/2015
by Ivan Vendrov, et al.

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images. In this paper we advocate for explicitly modeling the partial order structure of this hierarchy. Towards this goal, we introduce a general method for learning ordered representations, and show how it can be applied to a variety of tasks involving images and language. We show that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.
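
The abstract does not spell out how an ordered representation is scored, so the following is only a minimal sketch of one way to make the idea concrete. It assumes embeddings are non-negative vectors compared under a reversed coordinate-wise order, where a more general item has coordinates no larger than those of a more specific one, and it penalizes a pair by the squared amount by which that order is violated. The function name `order_violation` and the toy vectors are hypothetical, chosen for illustration.

```python
def order_violation(specific, general):
    """Squared penalty for violating 'specific lies below general'.

    Assumption: 'general' should be coordinate-wise <= 'specific'
    (reversed product order), so the penalty is zero exactly when
    every coordinate of 'general' is at most that of 'specific'.
    """
    if len(specific) != len(general):
        raise ValueError("embeddings must have the same dimension")
    return sum(max(0.0, g - s) ** 2 for s, g in zip(specific, general))


# Toy, hand-picked 2-d embeddings (not learned, purely illustrative).
animal = [1.0, 0.0]
dog = [2.0, 1.0]

print(order_violation(dog, animal))  # 0.0: the order holds, no penalty
print(order_violation(animal, dog))  # 2.0: reversed pair violates both coordinates
```

Unlike a symmetric distance, this penalty depends on the direction of the pair, which is what lets a single embedding space express "is a kind of" or "is described by" relations in a partial order.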
