CPTR: Full Transformer Network for Image Captioning

01/26/2021
by Wei Liu, et al.

In this paper, we consider the image captioning task from a new sequence-to-sequence prediction perspective and propose CaPtion TransformeR (CPTR), which takes sequentialized raw images as the input to a Transformer. Compared to the "CNN+Transformer" design paradigm, our model captures global context at every encoder layer from the very first layer and is entirely convolution-free. Extensive experiments demonstrate the effectiveness of the proposed model, which surpasses conventional "CNN+Transformer" methods on the MSCOCO dataset. Because the architecture is a full Transformer, we also provide detailed visualizations of the self-attention between patches in the encoder and of the "words-to-patches" attention in the decoder.
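To make "sequentialized raw images" concrete, here is a minimal sketch of one plausible way to turn an image into a token sequence: cut it into non-overlapping square patches and flatten each patch into a vector, in row-major order. The abstract confirms only that the encoder attends between patches. The function name `image_to_patch_sequence`, the `patch_size` parameter, and the toy 4x4 image are assumptions for illustration, and later steps such as a learned projection or position embeddings are not shown.

```python
def image_to_patch_sequence(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened patches.

    Hypothetical helper: patches are emitted in row-major order, and each
    patch is flattened to a list of patch_size * patch_size * C values.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    sequence = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    # Append all channel values of this pixel.
                    patch.extend(image[row][col])
            sequence.append(patch)
    return sequence


# Toy 4x4 image with 3 channels; each pixel stores its own index (assumed data).
image = [[[r * 4 + c] * 3 for c in range(4)] for r in range(4)]
tokens = image_to_patch_sequence(image, patch_size=2)
print(len(tokens), len(tokens[0]))  # prints "4 12"
```

Each element of `tokens` plays the role a word embedding plays in a text Transformer, so every patch can attend to every other patch from the first encoder layer onward.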

