A Self-Explainable Stylish Image Captioning Framework via Multi-References

10/20/2021
by Chengxi Li, et al.

In this paper, we propose to build a stylish image captioning model through a Multi-style Multi-modality mechanism (2M). We demonstrate that 2M yields an effective stylish captioner, and that the multi-references the model produces can also support explaining the model by identifying erroneous input features on faulty examples. We show how the 2M mechanism can be used to build stylish captioning models, and how those models can in turn be leveraged to explain their own likely errors.
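The abstract's idea of using multi-references to surface faulty examples can be illustrated with a minimal sketch (this is not the paper's actual 2M method; the function names, the token-overlap score, and the threshold are hypothetical): compare a generated caption against several style references and flag it for inspection when it agrees with none of them.

```python
# Hypothetical sketch: flag generations that match none of the multi-style
# references, as candidates whose input features may be erroneous.
# Not the paper's 2M implementation; overlap metric and threshold are illustrative.

def token_overlap(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference."""
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    if not cand:
        return 0.0
    return sum(tok in ref for tok in cand) / len(cand)

def flag_faulty(candidate: str, references: list[str], threshold: float = 0.3):
    """Return (is_faulty, best_score): faulty if the caption's best overlap
    with any style reference falls below the threshold."""
    best = max(token_overlap(candidate, r) for r in references)
    return best < threshold, best

# Two references for the same image in different styles (illustrative data).
refs = [
    "a dog runs happily across the sunny beach",  # humorous style
    "a dog is running on the beach",              # factual style
]
print(flag_faulty("a dog runs on the beach", refs))       # agrees with references
print(flag_faulty("purple elephants fly quietly", refs))  # matches neither, flagged
```

In practice a stronger multi-reference metric (e.g. BLEU or CIDEr over all references) would replace the bag-of-words overlap, but the flagging logic stays the same.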


Related research

- 05/29/2020: Controlling Length in Image Captioning
  We develop and evaluate captioning models that allow control of caption ...
- 04/05/2023: Towards Self-Explainability of Deep Neural Networks with Heatmap Captioning and Large-Language Models
  Heatmaps are widely used to interpret deep neural networks, particularly...
- 08/28/2018: Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation
  Neural text generation, including neural machine translation, image capt...
- 03/20/2021: 3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model
  In this paper, we build a multi-style generative model for stylish image...
- 10/04/2021: Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning
  Explaining an image with missing or non-existent objects is known as obj...
- 01/26/2023: Style-Aware Contrastive Learning for Multi-Style Image Captioning
  Existing multi-style image captioning methods show promising results in ...
- 11/17/2017: ADVISE: Symbolism and External Knowledge for Decoding Advertisements
  In order to convey the most content in their limited space, advertisemen...
