Analysis of diversity-accuracy tradeoff in image captioning

02/27/2020
by   Ruotian Luo, et al.

We investigate how model architecture, training objective, hyperparameter settings, and decoding procedure affect the diversity of automatically generated image captions. Our results show that 1) simple decoding by naive sampling, coupled with a low temperature, is a competitive and fast method for producing diverse and accurate caption sets; 2) training with a CIDEr-based reward using reinforcement learning harms the diversity properties of the resulting generator, and this harm cannot be mitigated by manipulating decoding parameters. In addition, we propose a new metric, AllSPICE, for evaluating both the accuracy and the diversity of a set of captions with a single value.
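
To make the first finding concrete, the following is a minimal sketch of naive sampling with a low temperature, not the authors' implementation. The decoder interface `step_logits_fn` (a function mapping a token prefix to next-token scores), the end-of-sequence id `eos_id`, and the default temperature of 0.5 are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities; temperature < 1 sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token id from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_caption_set(step_logits_fn, n_captions=5, max_len=20,
                       eos_id=0, temperature=0.5, seed=0):
    """Sample several captions independently (naive sampling).

    step_logits_fn is a hypothetical stand-in for a captioning decoder:
    it takes the list of tokens generated so far and returns next-token scores.
    """
    rng = random.Random(seed)
    captions = []
    for _ in range(n_captions):
        tokens = []
        for _ in range(max_len):
            token = sample_token(step_logits_fn(tokens), temperature, rng)
            if token == eos_id:  # stop at the assumed end-of-sequence id
                break
            tokens.append(token)
        captions.append(tokens)
    return captions
```

Lowering `temperature` concentrates probability mass on the highest-scoring tokens, which would tend to make each sampled caption more accurate while reducing variation across the set; raising it has the opposite effect.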


research
03/28/2019

Describing like humans: on diversity in image captioning

Recently, the state-of-the-art models for image captioning have overtake...
research
08/14/2019

Towards Diverse and Accurate Image Captions via Reinforcing Determinantal Point Process

Although significant progress has been made in the field of automatic im...
research
07/14/2020

Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets

A wide range of image captioning models has been developed, achieving si...
research
05/28/2022

Variational Transformer: A Framework Beyond the Trade-off between Accuracy and Diversity for Image Captioning

Accuracy and Diversity are two essential metrizable manifestations in ge...
research
12/19/2019

Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity

Image captioning as a multimodal task has drawn much interest in recent ...
research
10/19/2021

A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation

A creative image-and-text generative AI system mimics humans' extraordin...
research
10/25/2022

Information Filter upon Diversity-Improved Decoding for Diversity-Faithfulness Tradeoff in NLG

Some Natural Language Generation (NLG) tasks require both faithfulness a...
