Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

09/30/2020
by Xiangxi Shi, et al.

Change captioning is the task of describing the difference between images in natural language. Most existing methods treat the problem as simple difference detection and assume that no distractors, such as viewpoint changes, are present. In practice, however, viewpoint changes occur often and can overwhelm the semantic difference to be described. In this paper, we propose a novel visual encoder that explicitly distinguishes viewpoint changes from semantic changes in the change captioning task. We also simulate the attention preference of humans and propose a novel reinforcement learning process that fine-tunes the attention directly with language evaluation rewards. Extensive experimental results show that our method outperforms state-of-the-art approaches by a large margin on both the Spot-the-Diff and CLEVR-Change datasets.

