The Role of Syntactic Planning in Compositional Image Captioning

by Emanuele Bugliarello, et al.

Image captioning research has focused on generalizing to images drawn from the same distribution as the training set, rather than on the more challenging problem of generalizing to different distributions of images. Recently, Nikolaus et al. (2019) introduced a dataset to assess compositional generalization in image captioning, where models are evaluated on their ability to describe images with unseen adjective-noun and noun-verb compositions. In this work, we investigate different methods to improve compositional generalization by planning the syntactic structure of a caption. Our experiments show that jointly modeling tokens and syntactic tags enhances generalization in both RNN- and Transformer-based models, while also improving performance on standard metrics.
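
As a rough illustration of what jointly modeling tokens and syntactic tags could look like at the data level, the sketch below builds a single target sequence in which each tag precedes its token, so a decoder would predict the syntactic category before the word. This is an assumption made for illustration, not the authors' implementation: the function names, the `<TAG>` marker format, and the example tag set are all hypothetical.

```python
from typing import List, Tuple


def interleave_tags(tagged_caption: List[Tuple[str, str]]) -> List[str]:
    """Build one target sequence where each (hypothetical) tag marker precedes its token."""
    sequence: List[str] = []
    for token, tag in tagged_caption:
        sequence.append(f"<{tag}>")  # the plan: syntactic category first
        sequence.append(token)       # then the word that realizes it
    return sequence


def strip_tags(sequence: List[str]) -> List[str]:
    """Recover the plain caption by dropping the tag markers."""
    return [s for s in sequence if not (s.startswith("<") and s.endswith(">"))]


# Hypothetical caption with an adjective-noun and a noun-verb composition.
caption = [("a", "DET"), ("white", "ADJ"), ("dog", "NOUN"), ("runs", "VERB")]
target = interleave_tags(caption)
plain = strip_tags(target)
```

Under this assumed format, a model trained on `target` emits a tag before every word, and `strip_tags` recovers the caption text for scoring with standard metrics.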




Compositional Generalization in Image Captioning

Image captioning models are usually evaluated on their ability to descri...

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

Image captioning research achieved breakthroughs in recent years by deve...

A Neural Compositional Paradigm for Image Captioning

Mainstream captioning models often follow a sequential structure to gene...

Image Captioning with Compositional Neural Module Networks

In image captioning where fluency is an important factor in evaluation, ...

Dropout during inference as a model for neurological degeneration in an image captioning network

We replicate a variation of the image captioning architecture by Vinyals...

Learning to generalize to new compositions in image understanding

Recurrent neural networks have recently been used for learning to descri...

Compositional Generalization without Trees using Multiset Tagging and Latent Permutations

Seq2seq models have been shown to struggle with compositional generaliza...

Code Repositories


Code and data for our paper "The Role of Syntactic Planning in Compositional Image Captioning", EACL 2021.

