Modelling Temporal Information Using Discrete Fourier Transform for Video Classification

03/20/2016
by Haimin Zhang, et al.

Video classification has recently attracted intensive research effort. However, most existing works are based on frame-level visual features, which may fail to model temporal information, e.g. characteristics accumulated over time. To capture temporal information in videos, we propose to analyse features in the frequency domain obtained via the discrete Fourier transform (DFT features). Frame-level features are first extracted by a pre-trained deep convolutional neural network (CNN). These time-domain features are then interpolated and transformed into DFT features. CNN and DFT features are further encoded using different pooling methods and fused for video classification. In this way, static image features extracted from a pre-trained deep CNN and temporal information represented by DFT features are jointly considered for video classification. We test our method on video emotion classification and action recognition. Experimental results demonstrate that combining DFT features effectively captures temporal information and thereby improves the performance of both video emotion classification and action recognition. Our approach achieves state-of-the-art performance on the largest video emotion dataset (VideoEmotion-8) and competitive results on UCF-101.
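The abstract describes interpolating per-frame CNN features to a fixed temporal length, applying the discrete Fourier transform along time, and fusing the result with pooled CNN features. The sketch below illustrates that pipeline in a minimal form; the fixed length, number of frequency bins kept, the magnitude spectrum, and average pooling are illustrative assumptions, not the authors' exact settings.

```python
# Minimal sketch of building DFT features from frame-level CNN features.
# Assumes features are already extracted into a (num_frames, dim) array;
# target_len and keep_bins are hypothetical parameters for illustration.
import numpy as np

def dft_features(frame_feats: np.ndarray, target_len: int = 64, keep_bins: int = 16) -> np.ndarray:
    """Interpolate per-frame features to a fixed temporal length, apply the DFT
    along the time axis, and keep low-frequency magnitude components."""
    num_frames, dim = frame_feats.shape
    src_t = np.linspace(0.0, 1.0, num_frames)
    dst_t = np.linspace(0.0, 1.0, target_len)
    # Interpolate each feature dimension to target_len time steps.
    interp = np.stack([np.interp(dst_t, src_t, frame_feats[:, d]) for d in range(dim)], axis=1)
    # Real FFT along time; take the magnitude spectrum as the temporal descriptor.
    spectrum = np.abs(np.fft.rfft(interp, axis=0))   # shape: (target_len // 2 + 1, dim)
    return spectrum[:keep_bins].reshape(-1)          # flatten the kept low-frequency bins

# Example: fuse average-pooled CNN features with DFT features for one video.
frames = np.random.rand(120, 512)                    # 120 frames, 512-D CNN features
video_repr = np.concatenate([frames.mean(axis=0), dft_features(frames)])
```

In the paper's terms, the pooled CNN features carry static appearance while the DFT features summarise how the features vary over time; concatenation is one simple way to fuse the two before classification.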


Related research

03/20/2016 · Modelling Temporal Information Using Discrete Fourier Transform for Recognizing Emotions in User-generated Videos
With the widespread of user-generated Internet videos, emotion recogniti...

07/13/2021 · Developmental Stage Classification of Embryos Using Two-Stream Neural Network with Linear-Chain Conditional Random Field
The developmental process of embryos follows a monotonic order. An embry...

01/31/2016 · Order-aware Convolutional Pooling for Video Based Action Recognition
Most video based action recognition approaches create the video-level re...

03/15/2020 · Energy-based Periodicity Mining with Deep Features for Action Repetition Counting in Unconstrained Videos
Action repetition counting is to estimate the occurrence times of the re...

12/10/2018 · SlowFast Networks for Video Recognition
We present SlowFast networks for video recognition. Our model involves (...

11/08/2019 · Extracting temporal features into a spatial domain using autoencoders for sperm video analysis
In this paper, we present a two-step deep learning method that is used t...

05/12/2017 · Single Image Action Recognition by Predicting Space-Time Saliency
We propose a novel approach based on deep Convolutional Neural Networks ...
