L2C: Describing Visual Differences Needs Semantic Understanding of Individuals

02/03/2021
by An Yan, et al.

Recent advances in language and vision have pushed research from captioning a single image toward describing visual differences between image pairs. Given two images, I_1 and I_2, the task is to generate a description W_1,2 comparing them. Existing methods directly model the mapping I_1, I_2 -> W_1,2 without a semantic understanding of the individual images. In this paper, we introduce a Learning-to-Compare (L2C) model, which learns to understand the semantic structures of the two images and compare them while learning to describe each one. We demonstrate that L2C benefits from a comparison between explicit semantic representations and single-image captions, and generalizes better to new test image pairs. It outperforms the baseline on both automatic and human evaluation on the Birds-to-Words dataset.
