Transform and Tell: Entity-Aware News Image Captioning

04/17/2020
by Alasdair Tran, et al.

We propose an end-to-end model which generates captions for images embedded in news articles. News images present two key challenges: they rely on real-world knowledge, especially about named entities; and they typically have linguistically rich captions that include uncommon words. We address the first challenge by associating words in the caption with faces and objects in the image, via a multi-modal, multi-head attention mechanism. We tackle the second challenge with a state-of-the-art transformer language model that uses byte-pair encoding to generate captions as a sequence of word parts. On the GoodNews dataset, our model outperforms the previous state of the art by a factor of four in CIDEr score (13 to 54). This performance gain comes from a unique combination of language models, word representation, image embeddings, face embeddings, object embeddings, and improvements in neural network design. We also introduce the NYTimes800k dataset, which is 70% larger than GoodNews, has higher article quality, and includes the locations of images within articles as an additional contextual cue.
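
To illustrate what generating captions "as a sequence of word parts" means, here is a minimal sketch of byte-pair encoding on a toy word list. It is not the authors' tokenizer: the function names (learn_bpe, merge_pair, segment) and the tie-breaking rule are assumptions made for this example, and a real system would more likely reuse the vocabulary of a pretrained language model than learn merges from scratch.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the joined symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a toy list of words."""
    # Each distinct word starts as a tuple of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself
        # (an assumption of this sketch, chosen only for determinism).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into word parts by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
    print(merges)
    print(segment("lowly", merges))
```

Because an unseen word is always split into parts that concatenate back to the original string, a model that emits such parts can spell out rare names without needing them in a fixed word-level vocabulary.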

research
08/16/2023

Visually-Aware Context Modeling for News Image Captioning

The goal of News Image Captioning is to generate an image caption accord...
research
10/08/2020

VisualNews : Benchmark and Challenges in Entity-aware Image Captioning

In this paper we propose VisualNews-Captioner, an entity-aware model for...
research
09/07/2021

Journalistic Guidelines Aware News Image Captioning

The task of news article image captioning aims to generate descriptive a...
research
04/02/2019

Good News, Everyone! Context driven entity-aware captioning for news images

Current image captioning systems perform at a merely descriptive level, ...
research
08/02/2018

SWDE : A Sub-Word And Document Embedding Based Engine for Clickbait Detection

In order to expand their reach and increase website ad revenue, media ou...
research
10/10/2022

Generating image captions with external encyclopedic knowledge

Accurately reporting what objects are depicted in an image is largely a ...
research
12/01/2022

Focus! Relevant and Sufficient Context Selection for News Image Captioning

News Image Captioning requires describing an image by leveraging additio...
