IC^3: Image Captioning by Committee Consensus

02/02/2023
by   David M. Chan, et al.

If you ask a human to describe an image, they might do so in a thousand different ways. Traditionally, image captioning models are trained to approximate the reference distribution of image captions; however, doing so encourages captions that are viewpoint-impoverished. Such captions often focus on only a subset of the possible details, while ignoring potentially useful information in the scene. In this work, we introduce a simple yet novel method, "Image Captioning by Committee Consensus" (IC^3), designed to generate a single caption that captures high-level details from several viewpoints. Notably, humans rate captions produced by IC^3 as at least as helpful as those of baseline SOTA models more than two thirds of the time, and IC^3 captions can improve the performance of SOTA automated recall systems by up to 84% for visual description. Our code is publicly available at https://github.com/DavidMChan/caption-by-committee

research
05/29/2020

Controlling Length in Image Captioning

We develop and evaluate captioning models that allow control of caption ...
research
08/04/2021

Question-controlled Text-aware Image Captioning

For an image with multiple scene texts, different people may be interest...
research
12/06/2022

Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning

Discriminativeness is a desirable feature of image captions: captions sh...
research
06/06/2023

SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning

In scholarly documents, figures provide a straightforward way of communi...
research
04/15/2018

Pragmatically Informative Image Captioning with Character-Level Reference

We combine a neural image captioner with a Rational Speech Acts (RSA) mo...
research
03/08/2020

Better Captioning with Sequence-Level Exploration

Sequence-level learning objective has been widely used in captioning tas...
research
09/16/2022

Belief Revision based Caption Re-ranker with Visual Semantic Information

In this work, we focus on improving the captions generated by image-capt...
