Contrastive Semantic Similarity Learning for Image Captioning Evaluation with Intrinsic Auto-encoder

06/29/2021
by Chao Zeng, et al.

Automatically evaluating the quality of image captions is challenging because human language is flexible: the same meaning can be expressed in many different ways. Most current captioning metrics rely on token-level matching between the candidate caption and the ground-truth reference sentences, and therefore usually neglect sentence-level information. Motivated by the auto-encoder mechanism and by advances in contrastive representation learning, we propose a learning-based metric for image captioning, which we call Intrinsic Image Captioning Evaluation (I^2CE). We develop three progressive model structures to learn sentence-level representations: a single-branch model, a dual-branch model, and a triple-branch model. Our empirical tests show that I^2CE trained with the dual-branch structure achieves better consistency with human judgments than contemporary image captioning evaluation metrics. Furthermore, we select several state-of-the-art image captioning models and test their performance on the MS COCO dataset with respect to both contemporary metrics and the proposed I^2CE. Experimental results show that the scores from our proposed method align well with those generated by other contemporary metrics. In this respect, the proposed metric could serve as a novel indicator of the intrinsic information between captions, which may be complementary to existing metrics.
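
The limitation of token-level matching described in the abstract can be illustrated with a toy example. The sketch below is not I^2CE and not any metric from the paper: `unigram_precision` is a hypothetical helper that computes a simple clipped unigram precision, and the three captions are invented for illustration. It only shows how a score based purely on shared tokens can rank a scrambled sentence above a faithful paraphrase.

```python
from collections import Counter


def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision (hypothetical helper, illustration only).

    Returns the fraction of candidate tokens that also occur in the
    reference, counting each reference token at most as often as it appears.
    """
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(cand_tokens)
    matched = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    return matched / len(cand_tokens)


# Invented captions, not taken from MS COCO or from the paper.
reference = "a man is riding a horse on the beach"
paraphrase = "someone rides horseback along the shore"  # same meaning, new words
shuffled = "a beach is riding a man on the horse"       # same words, wrong meaning

print(unigram_precision(paraphrase, reference))  # low: only "the" is shared
print(unigram_precision(shuffled, reference))    # 1.0: every token matches
```

A metric that compares sentence-level representations, as the abstract proposes, is meant to address exactly this kind of mismatch between token overlap and meaning.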


research
12/14/2020

Intrinsic Image Captioning Evaluation

The image captioning task is about to generate suitable descriptions fro...
research
06/17/2018

Learning to Evaluate Image Captioning

Evaluation metrics for image captioning face two challenges. Firstly, co...
research
01/04/2022

StyleM: Stylized Metrics for Image Captioning Built with Contrastive N-grams

In this paper, we build two automatic evaluation metrics for evaluating ...
research
09/06/2018

Object Hallucination in Image Captioning

Despite continuously improving performance, contemporary image captionin...
research
03/06/2020

Show, Edit and Tell: A Framework for Editing Image Captions

Most image captioning frameworks generate captions directly from images,...
research
06/12/2015

Technical Report: Image Captioning with Semantically Similar Images

This report presents our submission to the MS COCO Captioning Challenge ...
research
12/06/2018

Auto-Encoding Graphical Inductive Bias for Descriptive Image Captioning

We propose Scene Graph Auto-Encoder (SGAE) that incorporates the languag...
