3M: Multi-style image caption generation using Multi-modality features under Multi-UPDOWN model

03/20/2021
by Chengxi Li, et al.

In this paper, we build a multi-style generative model for stylish image captioning that uses multi-modality image features, namely ResNeXt features and text features generated by DenseCap. We propose the 3M model, a Multi-UPDOWN caption model that encodes multi-modality features and decodes them into captions. We demonstrate the effectiveness of our model at generating human-like captions by examining its performance on two datasets, the PERSONALITY-CAPTIONS dataset and the FlickrStyle10K dataset. We compare against a variety of state-of-the-art baselines on automatic NLP metrics such as BLEU, ROUGE-L, CIDEr, and SPICE. A qualitative study also verifies that our 3M model can generate captions in different styles.

