The Role of Syntactic Planning in Compositional Image Captioning

01/28/2021
by   Emanuele Bugliarello, et al.
14

Image captioning has focused on generalizing to images drawn from the same distribution as the training set, and not to the more challenging problem of generalizing to different distributions of images. Recently, Nikolaus et al. (2019) introduced a dataset to assess compositional generalization in image captioning, where models are evaluated on their ability to describe images with unseen adjective-noun and noun-verb compositions. In this work, we investigate different methods to improve compositional generalization by planning the syntactic structure of a caption. Our experiments show that jointly modeling tokens and syntactic tags enhances generalization in both RNN- and Transformer-based models, while also improving performance on standard metrics.

READ FULL TEXT

page 6

page 8

page 13

page 15

research
09/10/2019

Compositional Generalization in Image Captioning

Image captioning models are usually evaluated on their ability to descri...
research
07/12/2022

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning

Image captioning research achieved breakthroughs in recent years by deve...
research
10/23/2018

A Neural Compositional Paradigm for Image Captioning

Mainstream captioning models often follow a sequential structure to gene...
research
07/10/2020

Image Captioning with Compositional Neural Module Networks

In image captioning where fluency is an important factor in evaluation, ...
research
08/11/2018

Dropout during inference as a model for neurological degeneration in an image captioning network

We replicate a variation of the image captioning architecture by Vinyals...
research
08/27/2016

Learning to generalize to new compositions in image understanding

Recurrent neural networks have recently been used for learning to descri...
research
05/26/2023

Compositional Generalization without Trees using Multiset Tagging and Latent Permutations

Seq2seq models have been shown to struggle with compositional generaliza...

Please sign up or login with your details

Forgot password? Click here to reset