Reconstructive Sequence-Graph Network for Video Summarization

05/10/2021
by Bin Zhao, et al.

Exploiting inner-shot and inter-shot dependencies is essential for key-shot based video summarization. Current approaches mainly model the video as a frame sequence with recurrent neural networks. However, one potential limitation of sequence models is that they focus on capturing local neighborhood dependencies, while high-order, long-distance dependencies are not fully exploited. In general, the frames in each shot record a certain activity and vary smoothly over time, whereas multi-hop relationships occur frequently among shots. Both local and global dependencies are therefore important for understanding the video content. Motivated by this, we propose a Reconstructive Sequence-Graph Network (RSGN) that encodes the frames and shots hierarchically as a sequence and a graph: the frame-level dependencies are encoded by Long Short-Term Memory (LSTM), and the shot-level dependencies are captured by a Graph Convolutional Network (GCN). The videos are then summarized by exploiting both the local and global dependencies among shots. In addition, a reconstructor is developed to reward the summary generator, so that the generator can be optimized in an unsupervised manner, which mitigates the lack of annotated data in video summarization. Furthermore, under the guidance of the reconstruction loss, the predicted summary can better preserve the main video content and shot-level dependencies. Experimental results on three popular datasets (i.e., SumMe, TVSum and VTW) demonstrate the superiority of the proposed approach on the summarization task.
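
The hierarchical idea in the abstract (frames encoded within each shot, then shots related to each other through a graph) can be illustrated with a small sketch. This is not the authors' RSGN implementation: mean pooling stands in for the LSTM frame encoder, the cosine-similarity adjacency is an assumed way to build the shot graph, and the propagation step has no learned weights. All function names (mean_pool, shot_graph, graph_conv, encode) are hypothetical.

```python
import math

def mean_pool(frames):
    """Average the frame feature vectors of one shot.
    Assumption: a simple stand-in for the LSTM frame-level encoder."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def shot_graph(shots):
    """Build a row-normalized adjacency matrix over shots.
    Assumption: edges are non-negative cosine similarities plus self-loops."""
    n = len(shots)
    adj = [[max(cosine(shots[i], shots[j]), 0.0) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps each shot's own feature
    return [[w / sum(row) for w in row] for row in adj]

def graph_conv(adj, shots):
    """One propagation step: each shot becomes a weighted average of all
    shots it is connected to (no learned weight matrix in this sketch)."""
    n, dim = len(shots), len(shots[0])
    return [[sum(adj[i][j] * shots[j][d] for j in range(n)) for d in range(dim)]
            for i in range(n)]

def encode(video):
    """video: list of shots, each shot a list of frame feature vectors."""
    shots = [mean_pool(frames) for frames in video]
    return graph_conv(shot_graph(shots), shots)
```

Calling encode on a list of shots returns one vector per shot that mixes in information from similar shots anywhere in the video, which is the kind of long-distance, shot-level dependency a purely sequential model tends to miss.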


