Partially-Supervised Image Captioning

06/15/2018
by   Peter Anderson, et al.
6

Image captioning models are becoming increasingly successful at describing the content of images in restricted domains. However, if these models are to function in the wild - for example, as aids for the visually impaired - a much larger number and variety of visual concepts must be understood. In this work, we teach image captioning models new visual concepts with partial supervision, such as available from object detection and image label datasets. As these datasets contain text fragments rather than complete captions, we formulate this problem as learning from incomplete data. To flexibly characterize our uncertainty about the unobserved complete sequence, we represent each incomplete training sequence with its own finite state automaton encoding acceptable complete sequences. We then propose a novel algorithm for training sequence models, such as recurrent neural networks, on incomplete sequences specified in this manner. In the context of image captioning, our method lifts the restriction that previously required image captioning models to be trained on paired image-sentence corpora only, or otherwise required specialized model architectures to take advantage of alternative data modalities. Applying our approach to an existing neural captioning model, we achieve state of the art results on the novel object captioning task using the COCO dataset. We further show that we can train a captioning model to describe new visual concepts from the Open Images dataset while maintaining competitive COCO evaluation scores.

READ FULL TEXT

page 7

page 8

page 13

page 14

research
12/20/2018

nocaps: novel object captioning at scale

Image captioning models have achieved impressive results on datasets con...
research
04/25/2015

Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images

In this paper, we address the task of learning novel visual concepts, an...
research
03/30/2016

Rich Image Captioning in the Wild

We present an image caption system that addresses new challenges of auto...
research
09/10/2021

Partially-supervised novel object captioning leveraging context from paired data

In this paper, we propose an approach to improve image captioning soluti...
research
04/21/2017

Attend to You: Personalized Image Captioning with Context Sequence Memory Networks

We address personalization issues of image captioning, which have not be...
research
12/04/2020

Understanding Guided Image Captioning Performance across Domains

Image captioning models generally lack the capability to take into accou...
research
06/02/2021

Learning to Select: A Fully Attentive Approach for Novel Object Captioning

Image captioning models have lately shown impressive results when applie...

Please sign up or login with your details

Forgot password? Click here to reset