A Hierarchical Approach for Generating Descriptive Image Paragraphs

11/20/2016
by Jonathan Krause, et al.

Recent progress on image captioning has made it possible to generate novel sentences describing images in natural language, but compressing an image into a single sentence can describe visual content in only coarse detail. While one new captioning approach, dense captioning, can potentially describe images in finer levels of detail by captioning many regions within an image, it in turn is unable to produce a coherent story for an image. In this paper we overcome these limitations by generating entire paragraphs for describing images, which can tell detailed, unified stories. We develop a model that decomposes both images and paragraphs into their constituent parts, detecting semantic regions in images and using a hierarchical recurrent neural network to reason about language. Linguistic analysis confirms the complexity of the paragraph generation task, and thorough experiments on a new dataset of image and paragraph pairs demonstrate the effectiveness of our approach.
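The two-level decomposition described above can be illustrated with a small sketch. This is not the authors' implementation. It assumes region features are already available as plain vectors, pools them by elementwise max, uses plain tanh recurrent cells with random, untrained weights, and generates a fixed number of sentences and words. All names (pool_regions, rnn_step, generate_paragraph) and all sizes are hypothetical, chosen only to show how a sentence-level recurrence can hand a topic vector to a word-level recurrence.

```python
import math
import random


def make_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these from data.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_h, x, h):
    # Plain tanh recurrent cell: h' = tanh(W_in x + W_h h).
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_h, h))]


def pool_regions(region_features):
    # Assumption: summarize all detected regions by an elementwise max.
    return [max(column) for column in zip(*region_features)]


def generate_paragraph(region_features, vocab, hidden=8,
                       max_sentences=3, max_words=5, seed=0):
    rng = random.Random(seed)
    image_vec = pool_regions(region_features)
    dim = len(image_vec)

    # Sentence-level recurrence: one step per sentence.
    s_in = make_matrix(hidden, dim, rng)
    s_h = make_matrix(hidden, hidden, rng)
    # Word-level recurrence: one step per word within a sentence.
    w_in = make_matrix(hidden, hidden, rng)
    w_h = make_matrix(hidden, hidden, rng)
    w_out = make_matrix(len(vocab), hidden, rng)
    embed = make_matrix(len(vocab), hidden, rng)

    paragraph = []
    s_state = [0.0] * hidden
    for _ in range(max_sentences):
        # The sentence-level state doubles as the topic for this sentence.
        s_state = rnn_step(s_in, s_h, image_vec, s_state)
        w_state = [0.0] * hidden
        x = s_state  # the topic vector is the word-level cell's first input
        words = []
        for _ in range(max_words):
            w_state = rnn_step(w_in, w_h, x, w_state)
            scores = matvec(w_out, w_state)
            idx = max(range(len(vocab)), key=scores.__getitem__)  # greedy pick
            words.append(vocab[idx])
            x = embed[idx]  # feed the chosen word back in
        paragraph.append(words)
    return paragraph


if __name__ == "__main__":
    feats = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.4]]  # two made-up region vectors
    for sentence in generate_paragraph(feats, ["a", "dog", "runs", "on", "grass"]):
        print(" ".join(sentence))
```

Because the weights are random, the printed words carry no meaning. The sketch only shows the control flow: the outer loop decides what each sentence is about, and the inner loop spells that sentence out word by word.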
