Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling

08/11/2020
by Jiacheng Li, et al.

Visual Storytelling (VIST) is the task of telling a narrative story about a certain topic from a given photo stream. Existing studies focus on designing complex models, which rely on a huge amount of human-annotated data. However, annotating VIST data is extremely costly, and many topics cannot be covered in the training dataset because of the long-tail topic distribution. In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting. Inspired by the way humans tell a story, we propose a topic-adaptive storyteller to model the ability of inter-topic generalization. In practice, we apply a gradient-based meta-learning algorithm to multi-modal seq2seq models to endow the model with the ability to adapt quickly from topic to topic. We further propose a prototype encoding structure to model the ability of intra-topic derivation. Specifically, we encode and restore the few training story texts to serve as a reference that guides generation at inference time. Experimental results show that topic adaptation and the prototype encoding structure mutually benefit the few-shot model on the BLEU and METEOR metrics. A further case study shows that the stories generated after few-shot adaptation are more relevant and expressive.
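To make the idea of gradient-based meta-learning across topics concrete, here is a minimal sketch. It is not the paper's model: it assumes a first-order approximation of the meta-gradient and replaces the multi-modal seq2seq storyteller with a toy scalar regressor, where each hypothetical "topic" is a line with its own slope. All function names and hyperparameters (`sample_topic`, `adapt`, `meta_train`, the learning rates) are illustrative assumptions.

```python
import random


def mse(w, batch):
    """Mean squared error of the toy model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in batch) / len(batch)


def mse_grad(w, batch):
    """Gradient of the mean squared error with respect to w."""
    return sum(2.0 * x * (w * x - y) for x, y in batch) / len(batch)


def sample_topic(rng, n=5):
    """Hypothetical 'topic': a linear task y = slope * x with its own slope.

    Returns a (support, query) pair of few-shot example lists.
    """
    slope = rng.uniform(1.0, 3.0)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return [(x, slope * x) for x in xs]

    return draw(), draw()


def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one topic's support examples."""
    for _ in range(steps):
        w = w - inner_lr * mse_grad(w, support)
    return w


def meta_train(iterations=2000, meta_lr=0.05, seed=0):
    """Outer loop: learn an initialization that adapts quickly to new topics."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(iterations):
        support, query = sample_topic(rng)
        w_adapted = adapt(w, support)
        # First-order approximation (an assumption of this sketch): apply the
        # query-set gradient taken at the adapted weight to the initialization,
        # ignoring second-order terms.
        w -= meta_lr * mse_grad(w_adapted, query)
    return w
```

In this toy setting, `meta_train()` returns a shared starting weight, and `adapt(w, support)` plays the role of few-shot adaptation to an unseen topic from its handful of support examples. The query loss after adaptation can then be compared against the loss before adaptation with `mse`.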

11/07/2021

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

Personalizing a speech synthesis system is a highly desired application,...
11/11/2019

Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication

Visual storytelling aims to generate a narrative paragraph from a sequen...
12/14/2021

TopNet: Learning from Neural Topic Model to Generate Long Stories

Long story generation (LSG) is one of the coveted goals in natural langu...
05/30/2018

Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling

Visual storytelling includes two important parts: coherence between the ...
03/01/2021

Universal-Prototype Augmentation for Few-Shot Object Detection

Few-shot object detection (FSOD) aims to strengthen the performance of n...
08/01/2019

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation

Image paragraph generation is the task of producing a coherent story (us...
09/01/2020

Bubble Storytelling with Automated Animation: A Brexit Hashtag Activism Case Study

Hashtag data are common and easy to acquire. Thus, they are widely used ...