Query Twice: Dual Mixture Attention Meta Learning for Video Summarization

08/19/2020
by Junyan Wang, et al.

Video summarization aims to select representative frames that retain high-level information, a task usually solved by predicting segment-wise importance scores via a softmax function. However, the softmax function struggles to retain high-rank representations of complex visual or sequential information, a limitation known as the softmax bottleneck problem. In this paper, we propose a novel framework, the Dual Mixture Attention model (DMASum) with Meta Learning, for video summarization that tackles the softmax bottleneck problem. Its Mixture of Attention (MoA) layer increases model capacity by applying self-query attention twice, which captures second-order changes in addition to the initial query-key attention. A novel Single Frame Meta Learning rule is then introduced to improve generalization to small datasets with limited training sources. Furthermore, DMASum exploits both visual and sequential attention, connecting local key-frame and global attention in an accumulative way. We adopt the new evaluation protocol on two public datasets, SumMe and TVSum. Both qualitative and quantitative experiments show significant improvements over state-of-the-art methods.
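
To make the softmax bottleneck concrete, the following minimal NumPy sketch compares the rank of the log-attention matrix produced by a single softmax with that of a mixture of softmaxes. It is a generic illustration under stated assumptions, not the paper's MoA layer: the function names, the random inputs, the sizes n, d and K, and the gating scheme are all hypothetical choices made for the example. With d-dimensional queries and keys, the logits of a single softmax have rank at most d, so the log of its output has rank at most d + 1; the log of a convex mixture of several softmaxes is not bound by that limit.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def single_softmax_attention(q, k):
    # Standard scaled dot-product attention weights, shape (n, n).
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))

def mixture_softmax_attention(qs, ks, gate_logits):
    # Hypothetical mixture: K independent softmax attention maps,
    # combined per query row with softmax-normalized gate weights.
    # qs, ks: (K, n, d); gate_logits: (n, K).
    d = qs.shape[-1]
    pi = softmax(gate_logits, axis=-1)                      # (n, K)
    comps = np.stack([softmax(q @ k.T / np.sqrt(d))
                      for q, k in zip(qs, ks)])             # (K, n, n)
    return np.einsum("nk,knm->nm", pi, comps)               # (n, n)

rng = np.random.default_rng(0)
n, d, K = 32, 4, 3  # assumed sizes: frames, feature dim, mixture components

single = single_softmax_attention(rng.normal(size=(n, d)),
                                  rng.normal(size=(n, d)))
mixed = mixture_softmax_attention(rng.normal(size=(K, n, d)),
                                  rng.normal(size=(K, n, d)),
                                  rng.normal(size=(n, K)))

# Rank of the log-probability matrices, with an explicit tolerance
# so floating-point noise is not counted as extra rank.
rank_single = np.linalg.matrix_rank(np.log(single), tol=1e-8)
rank_mixed = np.linalg.matrix_rank(np.log(mixed), tol=1e-8)
print("single softmax log-rank:", rank_single, "(bound: d + 1 =", d + 1, ")")
print("mixture log-rank:", rank_mixed)
```

The single-softmax rank stays within d + 1 regardless of how many frames n are scored, while the mixture exceeds that bound. This is the general motivation for mixture-style attention; how DMASum derives its mixture from the second query is described in the full paper.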
