CLID: Controlled-Length Image Descriptions with Limited Data

11/27/2022
by Elad Hirsch, et al.

Controllable image captioning models generate human-like image descriptions while offering a degree of control over the generated captions. This paper focuses on controlling the caption length, i.e., producing either a short and concise description or a long and detailed one. Since existing image captioning datasets contain mostly short captions, generating long captions is challenging. To address the shortage of long training examples, we propose to enrich the dataset with varying-length self-generated captions. These, however, might be of varying quality and are thus unsuitable for conventional training. We introduce a novel training strategy that selects the data points to be used at different times during training. Our method dramatically improves length-control abilities while exhibiting state-of-the-art performance in terms of caption quality. Our approach is general and is also shown to apply to paragraph generation.
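The abstract does not say how data points are selected over the course of training, so the following is only a minimal sketch of one way such time-dependent selection could look, not the authors' method. It assumes every caption carries a quality score in [0, 1], and it always keeps the original dataset captions while admitting self-generated ones only above a threshold that relaxes linearly across epochs. The names `Caption`, `length_bucket` and `select_for_epoch`, the score itself, and the threshold schedule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Caption:
    text: str
    self_generated: bool  # True for model-generated captions added to the dataset
    quality: float        # hypothetical quality score in [0, 1]


def length_bucket(text: str, width: int = 10) -> int:
    """Map a caption to a coarse length level (assumed: word count // width)."""
    return len(text.split()) // width


def select_for_epoch(pool: List[Caption], epoch: int, total_epochs: int,
                     start: float = 0.9, end: float = 0.5) -> List[Caption]:
    """Pick the captions to train on at a given epoch.

    Original dataset captions are always kept. Self-generated captions must
    reach a quality threshold that moves linearly from `start` (first epoch)
    to `end` (last epoch). This schedule is an assumption for illustration.
    """
    frac = epoch / max(total_epochs - 1, 1)
    threshold = start + (end - start) * frac
    return [c for c in pool if not c.self_generated or c.quality >= threshold]


pool = [
    Caption("a dog on the grass", False, 1.0),
    Caption("a brown dog runs across a wide green lawn chasing a red ball", True, 0.95),
    Caption("a dog a dog a lawn with a ball and a dog on it", True, 0.6),
]
early = select_for_epoch(pool, epoch=0, total_epochs=5)
late = select_for_epoch(pool, epoch=4, total_epochs=5)
```

Under these assumptions, only the highest-scoring self-generated caption joins the original one early in training, and the lower-scoring one is admitted by the final epoch. The length bucket could serve as the control signal given to the model.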

research
05/29/2020

Controlling Length in Image Captioning

We develop and evaluate captioning models that allow control of caption ...
research
09/15/2023

PatFig: Generating Short and Long Captions for Patent Figures

This paper introduces Qatent PatFig, a novel large-scale patent figure d...
research
08/28/2021

Goal-driven text descriptions for images

A big part of achieving Artificial General Intelligence (AGI) is to build...
research
09/28/2021

CIDEr-R: Robust Consensus-based Image Description Evaluation

This paper shows that CIDEr-D, a traditional evaluation metric for image...
research
07/19/2020

Length-Controllable Image Captioning

The last decade has witnessed remarkable progress in the image captionin...
research
10/16/2021

Self-Annotated Training for Controllable Image Captioning

The Controllable Image Captioning (CIC) task aims to generate captions c...
research
06/07/2022

Improving Image Captioning with Control Signal of Sentence Quality

In the dataset of image captioning, each image is aligned with several c...
