Use of Affective Visual Information for Summarization of Human-Centric Videos

07/08/2021
by   Berkay Köprü, et al.

The increasing volume of user-generated human-centric video content, and applications over it such as video retrieval and browsing, require compact representations, which are the subject of the video summarization literature. Current supervised studies formulate video summarization as a sequence-to-sequence learning problem, and existing solutions often neglect the surge of human-centric video content, which inherently carries affective information. In this study, we investigate an affective-information-enriched supervised video summarization task for human-centric videos. First, we train a visual-input-driven state-of-the-art continuous emotion recognition model (CER-NET) on the RECOLA dataset to estimate emotional attributes. Then, we integrate the estimated emotional attributes and the high-level representations from CER-NET with the visual information to define the proposed affective video summarization architectures (AVSUM). In addition, we investigate the use of attention to improve the AVSUM architectures and propose two new architectures based on temporal attention (TA-AVSUM) and spatial attention (SA-AVSUM). We conduct video summarization experiments on the TvSum database. The proposed AVSUM-GRU architecture, with early fusion of high-level GRU embeddings, and the temporal-attention-based TA-AVSUM architecture attain competitive video summarization performance, bringing strong improvements for human-centric videos over the state of the art in terms of F-score and a self-defined face recall metric.
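To make the fusion idea concrete, the following is a minimal sketch of the two ingredients named in the abstract: early fusion, where per-frame visual features are concatenated with emotion embeddings before scoring, and a simple temporal attention that weights frames over time. All dimensions, the dot-product attention form, and the function name `temporal_attention_scores` are illustrative assumptions, not the authors' actual AVSUM implementation.

```python
import numpy as np

def temporal_attention_scores(visual, emotion, w_query=None, seed=0):
    """Score each frame of a video by attending over the fused sequence.

    visual:  (T, Dv) array of per-frame visual features
    emotion: (T, De) array of per-frame emotion embeddings (e.g. from a CER model)
    Returns a (T,) array of frame importance scores normalized to [0, 1].
    """
    # Early fusion: concatenate the two feature streams per frame.
    fused = np.concatenate([visual, emotion], axis=1)   # (T, Dv + De)
    T, D = fused.shape
    rng = np.random.default_rng(seed)
    if w_query is None:
        # Stand-in for a learned attention query vector.
        w_query = rng.standard_normal(D) / np.sqrt(D)
    # Temporal attention: softmax over frame-vs-query similarities.
    logits = fused @ w_query                            # (T,)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Attended context vector summarizing the whole sequence.
    context = weights @ fused                           # (D,)
    # Frames similar to the attended context receive high scores.
    scores = fused @ context                            # (T,)
    smin, smax = scores.min(), scores.max()
    return (scores - smin) / (smax - smin + 1e-8)

# Toy example: 10 frames, 8-dim visual and 4-dim emotion features.
vis = np.random.default_rng(1).standard_normal((10, 8))
emo = np.random.default_rng(2).standard_normal((10, 4))
scores = temporal_attention_scores(vis, emo)
keyframes = np.argsort(scores)[-3:]  # keep the 3 highest-scoring frames
```

In a supervised setting, the query and projection parameters would be trained against human-annotated frame importance; here a fixed random query only illustrates the data flow from fused features to per-frame scores.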


Related research

12/02/2022 · Role of Audio in Audio-Visual Video Summarization
Video summarization attracts attention for efficient video representatio...

01/27/2021 · Efficient Video Summarization Framework using EEG and Eye-tracking Signals
This paper proposes an efficient video summarization framework that will...

07/17/2017 · Show and Recall: Learning What Makes Videos Memorable
With the explosion of video content on the Internet, there is a need for...

10/18/2022 · How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios
In recent years, deep neural networks have demonstrated increasingly str...

08/19/2020 · Query Twice: Dual Mixture Attention Meta Learning for Video Summarization
Video summarization aims to select representative frames to retain high-...

12/21/2018 · A Multi-task Neural Approach for Emotion Attribution, Classification and Summarization
Emotional content is a crucial ingredient in user-generated videos. Howe...

07/04/2023 · Query-based Video Summarization with Pseudo Label Supervision
Existing datasets for manually labelled query-based video summarization ...
