Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling

08/11/2020
by Jiacheng Li, et al.

Visual Storytelling (VIST) is the task of telling a narrative story about a certain topic from a given photo stream. Existing studies focus on designing complex models that rely on a huge amount of human-annotated data. However, annotating VIST data is extremely costly, and many topics cannot be covered in the training dataset because of the long-tail topic distribution. In this paper, we focus on enhancing the generalization ability of the VIST model by considering the few-shot setting. Inspired by the way humans tell a story, we propose a topic adaptive storyteller to model the ability of inter-topic generalization. In practice, we apply a gradient-based meta-learning algorithm to multi-modal seq2seq models to endow the model with the ability to adapt quickly from topic to topic. We further propose a prototype encoding structure to model the ability of intra-topic derivation. Specifically, we encode and store the few training story texts to serve as a reference that guides generation at inference time. Experimental results show that topic adaptation and the prototype encoding structure are mutually beneficial for the few-shot model on the BLEU and METEOR metrics. A further case study shows that the stories generated after few-shot adaptation are more relevant and expressive.
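To make the topic-adaptation idea in the abstract above concrete, here is a minimal sketch of gradient-based meta-learning on a toy problem. It is not the authors' implementation: it assumes a first-order approximation of the meta-gradient, and a scalar model `y = w * x` stands in for the multi-modal seq2seq storyteller. Every name in it (`sample_topic`, `adapt`, `meta_train`, the learning rates, the slope range) is hypothetical and chosen only for illustration. Each "topic" is a random slope with a small support set for adaptation and a query set for the meta-update.

```python
import random


def loss_and_grad(w, batch):
    """Mean squared error of the scalar model y = w * x, and its gradient in w."""
    n = len(batch)
    loss = sum((w * x - y) ** 2 for x, y in batch) / n
    grad = sum(2.0 * (w * x - y) * x for x, y in batch) / n
    return loss, grad


def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one topic's support set."""
    for _ in range(steps):
        _, g = loss_and_grad(w, support)
        w = w - inner_lr * g
    return w


def sample_topic(rng, k=5):
    """Hypothetical toy 'topic': a random slope with k support and k query pairs."""
    slope = rng.uniform(1.0, 3.0)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        return [(x, slope * x) for x in xs]

    return draw(), draw()


def meta_train(iterations=500, meta_lr=0.05, seed=0):
    """Outer loop: move the shared initialization so that it adapts well.

    Assumption: the query gradient is taken at the adapted weight and applied
    directly to the initialization (first-order approximation), instead of
    differentiating through the inner update.
    """
    rng = random.Random(seed)
    w = 0.0  # shared initialization across topics
    for _ in range(iterations):
        support, query = sample_topic(rng)
        w_adapted = adapt(w, support)
        _, g_query = loss_and_grad(w_adapted, query)
        w = w - meta_lr * g_query
    return w
```

At test time, a new topic's few annotated examples would play the role of `support`: calling `adapt(meta_train(), support)` yields topic-specific parameters after a small number of gradient steps. In this toy setting the learned initialization settles inside the range of topic slopes, so one adaptation step starts closer to any given topic than an uninformed initialization does.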

research
11/07/2021

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

Personalizing a speech synthesis system is a highly desired application,...
research
11/11/2019

Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication

Visual storytelling aims to generate a narrative paragraph from a sequen...
research
12/14/2021

TopNet: Learning from Neural Topic Model to Generate Long Stories

Long story generation (LSG) is one of the coveted goals in natural langu...
research
10/20/2020

Cue Me In: Content-Inducing Approaches to Interactive Story Generation

Automatically generating stories is a challenging problem that requires ...
research
03/01/2021

Universal-Prototype Augmentation for Few-Shot Object Detection

Few-shot object detection (FSOD) aims to strengthen the performance of n...
research
06/26/2023

ProtoDiff: Learning to Learn Prototypical Networks by Task-Guided Diffusion

Prototype-based meta-learning has emerged as a powerful technique for ad...
