Machine-in-the-Loop Rewriting for Creative Image Captioning

11/07/2021
by Vishakh Padmakumar, et al.

Machine-in-the-loop writing aims to help humans collaborate with models to complete writing tasks more effectively. Prior work has found that providing humans with a machine-written draft or sentence-level continuations has limited success, since the generated text tends to deviate from the human's intention. To let the user retain control over the content, we train a rewriting model that, when prompted, modifies specified spans of text within the user's original draft to introduce descriptive and figurative elements locally. We evaluate the model on its ability to collaborate with humans on the task of creative image captioning. In a user study conducted through Amazon Mechanical Turk, our model is rated as more helpful than a baseline infilling language model. In addition, third-party evaluation shows that users write more descriptive and figurative captions when collaborating with our model than when completing the task alone.
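
The span-level interaction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `<mask>` placeholder, the function names, and the `toy_rewriter` stand-in (which returns a fixed phrase instead of calling a trained model) are all assumptions made for illustration. The point it shows is that only the user-selected span changes, while the rest of the draft stays under the user's control.

```python
from typing import Callable

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper


def mask_span(draft: str, start: int, end: int) -> str:
    """Replace draft[start:end] with the placeholder, keeping the surrounding context."""
    if not 0 <= start < end <= len(draft):
        raise ValueError("span must lie inside the draft and be non-empty")
    return draft[:start] + MASK + draft[end:]


def rewrite_span(draft: str, start: int, end: int,
                 rewriter: Callable[[str, str], str]) -> str:
    """Rewrite only the selected span; text outside it is left untouched."""
    original = draft[start:end]
    masked = mask_span(draft, start, end)
    replacement = rewriter(masked, original)
    return draft[:start] + replacement + draft[end:]


def toy_rewriter(masked_draft: str, original_span: str) -> str:
    """Stand-in for a trained rewriting model (assumption): returns a fixed phrase."""
    return "blazing like molten gold"


draft = "The sun is very bright over the quiet lake."
start = draft.index("very bright")
end = start + len("very bright")
result = rewrite_span(draft, start, end, toy_rewriter)
print(result)
```

In a real system the `rewriter` callable would wrap a trained model that receives the masked draft and the original span; passing it in as a parameter keeps the span bookkeeping separate from whichever model is used.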

research
11/15/2022

PromptCap: Prompt-Guided Task-Aware Image Captioning

Image captioning aims to describe an image with a natural language sente...
research
01/31/2020

iCap: Interactive Image Captioning with Predictive Text

In this paper we study a brand new topic of interactive image captioning...
research
10/25/2018

Engaging Image Captioning Via Personality

Standard image captioning tasks such as COCO and Flickr30k are factual, ...
research
04/18/2021

CLIPScore: A Reference-free Evaluation Metric for Image Captioning

Image captioning has conventionally relied on reference-based automatic ...
research
04/21/2017

Attend to You: Personalized Image Captioning with Context Sequence Memory Networks

We address personalization issues of image captioning, which have not be...
research
06/06/2023

Putting Humans in the Image Captioning Loop

Image Captioning (IC) models can highly benefit from human feedback in t...
research
01/19/2019

Binary Image Selection (BISON): Interpretable Evaluation of Visual Grounding

Providing systems the ability to relate linguistic and visual content is...
