Deep Unsupervised Key Frame Extraction for Efficient Video Classification

11/12/2022
by Hao Tang, et al.

Video processing and analysis have become urgent tasks, since a huge number of videos are uploaded to online platforms (e.g., YouTube, Hulu) every day. Extracting representative key frames is an important step in video processing and analysis because it greatly reduces the computing resources and time required. Although great progress has been made recently, large-scale video classification remains an open problem, as existing methods do not balance performance and efficiency well. To tackle this problem, this work presents an unsupervised method for retrieving key frames that combines a Convolutional Neural Network (CNN) with Temporal Segment Density Peaks Clustering (TSDPC). The proposed TSDPC is a generic and powerful framework with two advantages over previous works: it calculates the number of key frames automatically, and it preserves the temporal information of the video. It thus improves the efficiency of video classification. Furthermore, a Long Short-Term Memory (LSTM) network is added on top of the CNN to further improve classification performance, and a weight fusion strategy over different input networks is presented to boost performance further. By optimizing video classification and key frame extraction simultaneously, we achieve better classification performance and higher efficiency. We evaluate our method on two popular datasets (i.e., HMDB51 and UCF101), and the experimental results consistently demonstrate that our strategy achieves competitive performance and efficiency compared with state-of-the-art approaches.
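
The abstract does not spell out how TSDPC works, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that per-frame CNN features are already available as a NumPy array, that the video is split into equal-length temporal segments, and that standard density peaks clustering (a local density rho and a distance delta to the nearest higher-density point) is run inside each segment. The function names, the Gaussian kernel, the cutoff quantile, and the "mean plus a multiple of the standard deviation" selection rule are all hypothetical choices made for illustration.

```python
import numpy as np


def density_peaks_key_frames(features, cutoff_quantile=0.2, gamma_factor=2.0):
    """Pick frames whose score gamma = rho * delta stands out (hypothetical rule)."""
    n = len(features)
    if n == 1:
        return [0]
    # Pairwise Euclidean distances between frame features.
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # Cutoff distance d_c: a quantile of the pairwise distances (assumed choice).
    d_c = max(np.quantile(dist[np.triu_indices(n, k=1)], cutoff_quantile), 1e-12)
    # Local density rho with a Gaussian kernel; subtract 1 to drop the self term.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest frame with higher density.
    order = np.argsort(-rho, kind="stable")
    delta = np.empty(n)
    delta[order[0]] = dist[order[0]].max()  # densest frame gets the largest distance
    for rank in range(1, n):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()
    gamma = rho * delta
    # The number of key frames is not fixed in advance: keep every outlier in gamma.
    picked = np.flatnonzero(gamma > gamma.mean() + gamma_factor * gamma.std())
    if picked.size == 0:
        picked = np.array([int(np.argmax(gamma))])  # always keep at least one frame
    return [int(i) for i in picked]


def temporal_segment_key_frames(features, num_segments=3, **kwargs):
    """Cluster each temporal segment separately so key frames stay in time order."""
    features = np.asarray(features, dtype=float)
    bounds = np.linspace(0, len(features), num_segments + 1).astype(int)
    key_frames = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        if end > start:
            local = density_peaks_key_frames(features[start:end], **kwargs)
            key_frames.extend(int(start) + i for i in local)
    return key_frames


if __name__ == "__main__":
    # Random vectors stand in for CNN features of a 30-frame video.
    rng = np.random.default_rng(0)
    demo_features = rng.normal(size=(30, 8))
    print(temporal_segment_key_frames(demo_features, num_segments=3))
```

Because each segment contributes at least one frame and segments are processed in order, the selected indices come out sorted by time, which is one simple way to keep temporal information; the number of frames per segment is decided by the data rather than set by hand.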

research
05/22/2015

Efficient Large Scale Video Classification

Video classification has advanced tremendously over the recent years. A ...
research
02/06/2020

An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos

We propose a novel scheme for human action recognition in videos, using ...
research
01/15/2019

Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion

Gesture recognition is a hot topic in computer vision and pattern recogn...
research
04/08/2015

Evaluating Two-Stream CNN for Video Classification

Videos contain very rich semantic information. Traditional hand-crafted ...
research
03/13/2020

Dual Temporal Memory Network for Efficient Video Object Segmentation

Video Object Segmentation (VOS) is typically formulated in a semi-superv...
research
04/24/2018

ECO: Efficient Convolutional Network for Online Video Understanding

The state of the art in video understanding suffers from two problems: (...
research
06/14/2017

Large-Scale YouTube-8M Video Understanding with Deep Neural Networks

Video classification problem has been studied many years. The success of...