Self-Attention Based Generative Adversarial Networks For Unsupervised Video Summarization

07/16/2023
by Maria Nektaria Minaidi, et al.

In this paper, we study the problem of producing a comprehensive video summary with an unsupervised approach that relies on adversarial learning. We build on a popular method in which a Generative Adversarial Network (GAN) is trained to create representative summaries that are indistinguishable from the originals. Introducing an attention mechanism into the architecture for the selection, encoding and decoding of video frames demonstrates the efficacy of self-attention and transformers in modeling temporal relationships for video summarization. We propose the SUM-GAN-AED model, which uses a self-attention mechanism for frame selection, combined with LSTMs for encoding and decoding. We evaluate SUM-GAN-AED on the SumMe, TVSum and COGNIMUSE datasets. Experimental results indicate that using self-attention for frame selection outperforms the state of the art on SumMe and achieves performance comparable to the state of the art on TVSum and COGNIMUSE.
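As a rough illustration of what self-attention-based frame selection can look like, the sketch below computes scaled dot-product self-attention over per-frame feature vectors and maps each attended frame to an importance score in (0, 1). It is not the SUM-GAN-AED implementation: the function name `self_attention_frame_scores`, the random projection matrices, the sigmoid scoring head and all dimensions are assumptions made for illustration only, and the GAN training and LSTM encoder/decoder are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_frame_scores(features, w_q, w_k, w_v, w_out):
    """Hypothetical frame selector, not the authors' code.

    features: (T, D) array with one feature vector per frame.
    Returns (scores, attn): scores has shape (T,), attn has shape (T, T).
    """
    q = features @ w_q                      # queries, (T, d)
    k = features @ w_k                      # keys,    (T, d)
    v = features @ w_v                      # values,  (T, d)
    d_k = q.shape[-1]
    # Each row of attn says how strongly one frame attends to every frame.
    attn = softmax(q @ k.T / np.sqrt(d_k), axis=-1)
    context = attn @ v                      # attended frame features, (T, d)
    logits = context @ w_out                # one logit per frame, (T,)
    scores = 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> importance in (0, 1)
    return scores, attn


# Toy input: 8 frames with 16-dimensional features (assumed sizes).
rng = np.random.default_rng(0)
T, D, d = 8, 16, 4
features = rng.normal(size=(T, D))
w_q, w_k, w_v = (rng.normal(size=(D, d)) * 0.1 for _ in range(3))
w_out = rng.normal(size=(d,)) * 0.1

scores, attn = self_attention_frame_scores(features, w_q, w_k, w_v, w_out)
# Scale each frame's features by its importance score, as a stand-in for
# the weighted features a downstream encoder would receive.
weighted = features * scores[:, None]
```

In a trained model the projection matrices would be learned jointly with the rest of the network; here they are random, so the scores carry no meaning beyond showing the shapes and data flow.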

research
01/27/2022

Exploring Global Diversity and Local Context for Video Summarization

Video summarization aims to automatically generate a diverse and concise...
research
04/17/2019

Cycle-SUM: Cycle-consistent Adversarial LSTM Networks for Unsupervised Video Summarization

In this paper, we present a novel unsupervised video summarization model...
research
11/20/2020

SalSum: Saliency-based Video Summarization using Generative Adversarial Networks

The huge amount of video data produced daily by camera-based systems, su...
research
12/05/2018

Summarizing Videos with Attention

In this work we propose a novel method for supervised, keyshots based vi...
research
09/27/2022

A comparative study of attention mechanism and generative adversarial network in facade damage segmentation

Semantic segmentation profits from deep learning and has shown its possi...
research
09/23/2020

Exploring global diverse attention via pairwise temporal relation for video summarization

Video summarization is an effective way to facilitate video searching an...
research
08/26/2021

Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?

In this paper we address the problem of fine-tuned text generation with ...
