Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN

by Cheng-Bin Jin, et al.

When we say a person is texting, can you tell whether the person is walking or sitting? Emphatically, no. To solve this incomplete representation problem, this paper presents a sub-action descriptor for detailed action detection. The sub-action descriptor consists of three levels: posture, locomotion, and gesture. Together, the three levels assign three sub-action categories to each action, which addresses the representation problem. The proposed action detection model simultaneously localizes and recognizes the actions of multiple individuals in video surveillance using appearance-based temporal features with multi-CNN. The proposed approach achieved a mean average precision (mAP) of 76.6 on the new large-scale ICVL video surveillance dataset, which the authors introduce and make available to the community with this paper. Extensive experiments on the benchmark KTH dataset demonstrate that the proposed approach improves action recognition performance over the state of the art. The action detection model runs at around 25 fps on the ICVL dataset and at more than 80 fps on the KTH dataset, which is suitable for real-time surveillance applications.
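The three-level descriptor can be pictured as a small record with one label per level. The sketch below is only an illustration of that idea, not the authors' implementation: the class names, the label sets, and the `from_scores` helper are hypothetical, and the abstract does not list the actual categories used in the paper.

```python
from dataclasses import dataclass
from enum import Enum


# Label sets are illustrative assumptions; the paper's real categories may differ.
class Posture(Enum):
    SITTING = "sitting"
    STANDING = "standing"


class Locomotion(Enum):
    STATIONARY = "stationary"
    WALKING = "walking"


class Gesture(Enum):
    NOTHING = "nothing"
    TEXTING = "texting"


@dataclass(frozen=True)
class SubActionDescriptor:
    """One action described at three levels: posture, locomotion, gesture."""
    posture: Posture
    locomotion: Locomotion
    gesture: Gesture

    def describe(self) -> str:
        return f"{self.posture.value}, {self.locomotion.value}, {self.gesture.value}"


def from_scores(posture_scores, locomotion_scores, gesture_scores):
    """Hypothetical helper: pick the top-scoring label at each level.

    Each argument maps a label of that level to a classifier score.
    """
    return SubActionDescriptor(
        posture=max(posture_scores, key=posture_scores.get),
        locomotion=max(locomotion_scores, key=locomotion_scores.get),
        gesture=max(gesture_scores, key=gesture_scores.get),
    )


# "Texting" alone is ambiguous; the full descriptor says how the person is texting.
descriptor = from_scores(
    {Posture.SITTING: 0.2, Posture.STANDING: 0.8},
    {Locomotion.STATIONARY: 0.3, Locomotion.WALKING: 0.7},
    {Gesture.NOTHING: 0.1, Gesture.TEXTING: 0.9},
)
print(descriptor.describe())
```

Keeping one field per level means the same gesture label can be paired with different postures and locomotion states, which is the ambiguity the opening question points at.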




Weakly-Supervised Multi-Person Action Recognition in 360° Videos

The recent development of commodity 360° cameras has enabled a single ...

P-CNN: Pose-based CNN Features for Action Recognition

This work targets human action recognition in video. While recent method...

A Real-time Action Representation with Temporal Encoding and Deep Compression

Deep neural networks have achieved remarkable success for video-based ac...

Depth-Aware Action Recognition: Pose-Motion Encoding through Temporal Heatmaps

Most state-of-the-art methods for action recognition rely only on 2D spa...

Linear-time Online Action Detection From 3D Skeletal Data Using Bags of Gesturelets

Sliding window is one direct way to extend a successful recognition syst...

Suspicious Behavior Detection on Shoplifting Cases for Crime Prevention by Using 3D Convolutional Neural Networks

Crime generates significant losses, both human and economic. Every year,...

vireoJD-MM at Activity Detection in Extended Videos

This notebook paper presents an overview and comparative analysis of our...
