Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training

04/23/2018
by Bei Liu, et al.

Automatic generation of natural language from images has attracted extensive attention. In this paper, we take one step further and investigate the generation of poetic language (with multiple lines) for an image, for automatic poetry creation. This task involves multiple challenges, including discovering poetic clues from the image (e.g., hope from green) and generating poems that satisfy both relevance to the image and poeticness at the language level. To solve these challenges, we formulate poem generation as two correlated sub-tasks addressed by multi-adversarial training via policy gradient, through which cross-modal relevance and poetic language style can be ensured. To extract poetic clues from images, we propose to learn a deep coupled visual-poetic embedding, in which the poetic representation from objects, sentiments and scenes in an image can be jointly learned. Two discriminative networks are further introduced to guide the poem generation: a multi-modal discriminator and a poem-style discriminator. To facilitate the research, we have collected two poem datasets by human annotators with two distinct properties: 1) the first human-annotated image-to-poem pair dataset (with 8,292 pairs in total), and 2) the largest public English poem corpus to date (with 92,265 different poems in total). Extensive experiments are conducted with 8K images generated with our model, among which 1.5K images are randomly picked for evaluation. Both objective and subjective evaluations show superior performance over state-of-the-art methods for poem generation from images. A Turing test carried out with over 500 human subjects, 30 of whom are poetry experts, demonstrates the effectiveness of our approach.
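
As a rough illustration of how two discriminators can guide a generator through policy gradient, the following minimal sketch blends two discriminator scores into one scalar reward and forms a REINFORCE-style surrogate loss for a single sampled poem. The abstract does not say how the two signals are combined, so the weighted sum, the `weight` and `baseline` parameters, the function names, and all numeric values here are assumptions for illustration only, not the authors' implementation.

```python
import math


def combined_reward(p_relevant, p_poetic, weight=0.5):
    """Blend two discriminator scores in [0, 1] into one scalar reward.

    p_relevant: hypothetical multi-modal discriminator score
                (does the poem match the image?).
    p_poetic:   hypothetical poem-style discriminator score
                (does the text read like a poem?).
    weight:     assumed mixing coefficient; not taken from the paper.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * p_relevant + (1.0 - weight) * p_poetic


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss for one sampled poem.

    Minimizing this loss raises the log-probability of sampled poems
    whose reward exceeds the baseline and lowers it otherwise.
    """
    return -(reward - baseline) * sum(token_log_probs)


# Hypothetical per-token log-probabilities of one sampled poem.
log_probs = [math.log(0.5), math.log(0.25)]

# Hypothetical discriminator outputs for that poem.
reward = combined_reward(p_relevant=0.8, p_poetic=0.6)
loss = reinforce_loss(log_probs, reward, baseline=0.5)
```

In a real training loop the log-probabilities would come from the generator and the loss would be backpropagated through them. The subtracted baseline is a common variance-reduction choice in policy-gradient methods, and is likewise an assumption here.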


