Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks

05/18/2017
by Zhuolin Jiang, et al.

Infrared (IR) imaging has the potential to enable more robust action recognition systems than visible spectrum cameras because it is less sensitive to lighting conditions and appearance variability. While action recognition on videos collected from visible spectrum imaging has received much attention, action recognition in IR videos is significantly less explored. Our objective is to exploit imaging data in this modality for the action recognition task. In this work, we propose a novel two-stream 3D convolutional neural network (CNN) architecture by introducing a discriminative code layer and a corresponding discriminative code loss function. The proposed network processes IR image sequences and IR-based optical flow field sequences. We pretrain the 3D CNN model on the visible spectrum Sports-1M action dataset and finetune it on the Infrared Action Recognition (InfAR) dataset. To the best of our knowledge, this is the first application of the 3D CNN to action recognition in the IR domain. We conduct a detailed analysis of different fusion schemes (weighted average, single-layer and double-layer neural nets) applied to different 3D CNN outputs. Experimental results demonstrate that our approach achieves state-of-the-art average precision (AP) performance on the InfAR dataset: (1) the proposed two-stream 3D CNN achieves the best reported 77.5% AP, and (2) the 3D CNN model applied to the optical flow fields achieves the best reported single-stream 75.42% AP.
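The simplest of the fusion schemes mentioned in the abstract, the weighted average, can be illustrated with a minimal sketch. The abstract does not give the paper's weights, score format, or class set, so everything here is an assumption made for illustration: the function names `weighted_average_fusion` and `predict_class`, the three-class score vectors, and the weights 0.4 and 0.6 are all hypothetical.

```python
from typing import List, Sequence


def weighted_average_fusion(
    stream_scores: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Fuse per-class scores from several streams with a weighted average.

    Hypothetical helper: each entry of `stream_scores` is one stream's
    per-class score vector (e.g. softmax outputs), and `weights` holds one
    non-negative weight per stream.
    """
    if len(stream_scores) != len(weights):
        raise ValueError("need exactly one weight per stream")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    num_classes = len(stream_scores[0])
    if any(len(scores) != num_classes for scores in stream_scores):
        raise ValueError("all streams must score the same number of classes")
    # Normalising by the weight total keeps the fused scores on the same
    # scale as the inputs.
    return [
        sum(w * scores[c] for w, scores in zip(weights, stream_scores)) / total
        for c in range(num_classes)
    ]


def predict_class(fused_scores: Sequence[float]) -> int:
    """Return the index of the highest fused score."""
    return max(range(len(fused_scores)), key=lambda c: fused_scores[c])


# Made-up scores for an IR image stream and an optical flow stream.
ir_scores = [0.6, 0.3, 0.1]
flow_scores = [0.2, 0.7, 0.1]
fused = weighted_average_fusion([ir_scores, flow_scores], weights=[0.4, 0.6])
predicted = predict_class(fused)
```

In this toy example the IR stream alone would pick class 0, while the fused scores favour class 1 because the optical flow stream is given the larger weight. In practice such weights would be tuned on held-out data; the neural-net fusion schemes named in the abstract replace the fixed weights with learned layers.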


research
12/09/2016

ActionFlowNet: Learning Motion Representation for Action Recognition

Even with the recent advances in convolutional neural networks (CNN) in ...
research
04/02/2017

Hidden Two-Stream Convolutional Networks for Action Recognition

Analyzing videos of human actions involves understanding the temporal re...
research
03/11/2019

Investigation on Combining 3D Convolution of Image Data and Optical Flow to Generate Temporal Action Proposals

In this paper, a novel two-stream architecture for the task of temporal ...
research
06/05/2022

3D Convolutional with Attention for Action Recognition

Human action recognition is one of the challenging tasks in computer vis...
research
10/02/2018

Representation Flow for Action Recognition

In this paper, we propose a convolutional layer inspired by optical flow...
research
08/10/2020

2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework

To address the problem of training on small datasets for action recognit...
research
07/15/2020

Temporal Distinct Representation Learning for Action Recognition

Motivated by the previous success of Two-Dimensional Convolutional Neura...
