Self-Annotated Training for Controllable Image Captioning

10/16/2021
by Zhangzi Zhu, et al.

The Controllable Image Captioning (CIC) task aims to generate captions conditioned on designated control signals. In this paper, we improve CIC in two respects: 1) Existing reinforcement training methods are not applicable to structure-related CIC models because the accuracy-based reward focuses mainly on content rather than semantic structure. The lack of reinforcement training prevents these models from generating more accurate and controllable sentences. To solve this problem, we propose a novel reinforcement training method for structure-related CIC models: Self-Annotated Training (SAT), in which a recursive sampling mechanism (RSM) is designed to force the input control signal to match the actual output sentence. Extensive experiments conducted on MSCOCO show that our SAT method improves the CIDEr-D score of C-Transformer (XE) from 118.6 to 130.1 in the length-control task and from 132.2 to 142.7 in the tense-control task, while maintaining more than 99% matching accuracy with the control signal. 2) We introduce a new control signal: sentence quality. Equipped with it, CIC models are able to generate captions of different quality levels as needed. Experiments show that, without additional information from ground-truth captions, models controlled by the highest level of sentence quality achieve much higher accuracy than baseline models.
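
The abstract does not spell out how the recursive sampling mechanism works, so the following is only a toy sketch of one plausible reading: sample a caption under a control signal, annotate the sampled caption with the signal it actually satisfies, and resample under that annotated signal until the two agree. Everything here is an assumption made for illustration, not the authors' implementation: the names `recursive_sample`, `length_level`, and `make_toy_sampler`, the length buckets, the `max_rounds` cap, and the random stand-in for a captioning model. The reinforcement reward and gradient update are omitted entirely.

```python
import random
from typing import Callable, List, Tuple

Caption = List[str]


def length_level(caption: Caption) -> int:
    """Hypothetical annotator: bucket a caption into a length-level signal.

    The bucket edges (<=9, 10-14, 15+) are illustrative, not from the paper.
    """
    n = len(caption)
    if n <= 9:
        return 1
    if n <= 14:
        return 2
    return 3


def recursive_sample(
    sample: Callable[[int], Caption],
    annotate: Callable[[Caption], int],
    signal: int,
    max_rounds: int = 5,
) -> Tuple[Caption, int, bool]:
    """Resample until the input signal matches the sampled caption's own signal.

    Returns (caption, signal_of_caption, matched). The returned signal is always
    the annotation of the returned caption; `matched` says whether the caption
    was also sampled under that same signal.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    for _ in range(max_rounds):
        caption = sample(signal)
        actual = annotate(caption)
        if actual == signal:
            return caption, signal, True
        # Mismatch: feed the self-annotated signal back in and sample again.
        signal = actual
    return caption, actual, False


def make_toy_sampler(rng: random.Random) -> Callable[[int], Caption]:
    """Stand-in for a captioning model: lengths are noisy around a target."""
    targets = {1: 8, 2: 12, 3: 16}

    def sample(signal: int) -> Caption:
        n = max(1, targets[signal] + rng.choice([-3, -1, 0, 1, 3]))
        return ["w"] * n

    return sample


if __name__ == "__main__":
    sampler = make_toy_sampler(random.Random(0))
    caption, signal, matched = recursive_sample(sampler, length_level, signal=1)
    print(len(caption), signal, matched)
```

In this sketch the returned pair is consistent by construction, which is the property a structure-aware reward would need: the signal attached to a sampled caption describes the caption that was actually produced rather than the one that was requested.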

research
06/07/2022

Improving Image Captioning with Control Signal of Sentence Quality

In the dataset of image captioning, each image is aligned with several c...
research
03/22/2021

Human-like Controllable Image Captioning with Verb-specific Semantic Roles

Controllable Image Captioning (CIC) – generating image descriptions foll...
research
11/27/2022

CLID: Controlled-Length Image Descriptions with Limited Data

Controllable image captioning models generate human-like image descripti...
research
01/20/2021

Macroscopic Control of Text Generation for Image Captioning

Despite the fact that image captioning models have been able to generate...
research
12/02/2021

Controllable Video Captioning with an Exemplar Sentence

In this paper, we investigate a novel and challenging task, namely contr...
research
04/07/2020

Context-Aware Group Captioning via Self-Attention and Contrastive Features

While image captioning has progressed rapidly, existing works focus main...
research
05/18/2022

It Isn't Sh!tposting, It's My CAT Posting

In this paper, we describe a novel architecture which can generate hilar...
