Multi-modal Summarization for Video-containing Documents

09/17/2020
by Xiyan Fu, et al.

Summarization of multimedia data is becoming increasingly important because it underpins many real-world applications, such as question answering and Web search. Most existing multi-modal summarization work, however, uses complementary visual features extracted from images rather than videos, and thereby loses abundant information. We therefore propose a novel multi-modal summarization task: summarizing a document together with its associated video. We also build a general baseline model with several strategies: bi-hop attention and improved late fusion mechanisms to bridge the gap between modalities, and a bi-stream summarization strategy that performs text and video summarization simultaneously. Comprehensive experiments show that the proposed model benefits multi-modal summarization and outperforms existing methods. Moreover, we collect a novel dataset of documents and videos, which provides a new resource for future study.

research
04/26/2021

GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization

Traditional video summarization methods generate fixed video representat...
research
10/11/2019

Multi-modal Deep Analysis for Multimedia

With the rapid development of Internet and multimedia services in the pa...
research
06/27/2021

Multi-Modal Chorus Recognition for Improving Song Search

We discuss a novel task, Chorus Recognition, which could potentially ben...
research
05/19/2023

A Topic-aware Summarization Framework with Different Modal Side Information

Automatic summarization plays an important role in the exponential docum...
research
09/11/2021

A Survey on Multi-modal Summarization

The new era of technology has brought us to the point where it is conven...
research
09/21/2019

Video Skimming: Taxonomy and Comprehensive Survey

Video skimming, also known as dynamic video summarization, generates a t...
research
05/19/2020

Multi-Modal Summary Generation using Multi-Objective Optimization

Significant development of communication technology over the past few ye...
