ACDnet: An action detection network for real-time edge computing based on flow-guided feature approximation and memory aggregation

02/26/2021
by Yu Liu, et al.

Interpreting human actions requires understanding the spatial and temporal context of a scene. State-of-the-art action detectors based on Convolutional Neural Networks (CNNs) have demonstrated remarkable results by adopting two-stream or 3D CNN architectures. However, these methods typically operate in a non-real-time, offline fashion because of the system complexity needed to reason about spatio-temporal information. Consequently, their high computational cost is incompatible with emerging real-world scenarios such as service robots or public surveillance, where detection needs to take place on resource-limited edge devices. In this paper, we propose ACDnet, a compact action detection network targeting real-time edge computing that addresses both efficiency and accuracy. It exploits the temporal coherence between successive video frames to approximate their CNN features rather than extracting them from scratch for every frame. It also integrates memory feature aggregation from past video frames to enhance current detection stability, implicitly modeling long temporal cues over time. Experiments conducted on the public benchmark datasets UCF-24 and JHMDB-21 demonstrate that ACDnet, when integrated with the SSD detector, can robustly achieve detection well above real-time (75 FPS). At the same time, it retains reasonable accuracy (70.92 and 49.53 frame mAP) compared to other top-performing methods that use far heavier configurations. Code will be available at https://github.com/dginhac/ACDnet.
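
The two ideas named in the abstract, approximating a frame's features from a previous frame and aggregating features over time, can be illustrated with a minimal NumPy sketch. This is not the ACDnet implementation: the function names `warp_features` and `aggregate`, the bilinear warping, the `(dx, dy)` flow convention, and the fixed blending weight `alpha` are all assumptions made here for illustration.

```python
import numpy as np

def warp_features(key_feat, flow):
    """Bilinearly warp key-frame features (C, H, W) toward the current frame.

    Assumed convention: flow has shape (2, H, W) and, for each current-frame
    location (y, x), holds the (dx, dy) offset of its match in the key frame.
    """
    c, h, w = key_feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Source coordinates in the key frame, clamped to the feature map.
    sx = np.clip(xs + flow[0], 0, w - 1)
    sy = np.clip(ys + flow[1], 0, h - 1)
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = sx - x0  # horizontal interpolation weight
    wy = sy - y0  # vertical interpolation weight
    top = key_feat[:, y0, x0] * (1 - wx) + key_feat[:, y0, x1] * wx
    bottom = key_feat[:, y1, x0] * (1 - wx) + key_feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def aggregate(memory, current, alpha=0.5):
    """Blend a running memory with current features (fixed weight is an assumption)."""
    if memory is None:
        return current
    return alpha * memory + (1 - alpha) * current

# Toy data: a 2-channel, 4x4 "feature map" standing in for CNN output.
key_feat = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)

# Zero flow leaves the features unchanged.
same = warp_features(key_feat, np.zeros((2, 4, 4)))

# A flow of dx = 1 makes every location sample one pixel to its right.
shift = np.zeros((2, 4, 4))
shift[0] = 1.0
shifted = warp_features(key_feat, shift)

# Aggregate the two approximated feature maps into a memory.
memory = aggregate(None, same)
memory = aggregate(memory, shifted)
```

The point of the sketch is the cost asymmetry: warping and blending are a handful of array operations per frame, whereas running a full CNN backbone on every frame is far more expensive. How the flow is estimated, how key frames are chosen, and how the memory is weighted are left open here.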

research
03/30/2017

Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos

Deep learning has been demonstrated to achieve excellent results for ima...
research
11/17/2021

TYolov5: A Temporal Yolov5 Detector Based on Quasi-Recurrent Neural Networks for Real-Time Handgun Detection in Video

Timely handgun detection is a crucial problem to improve public safety; ...
research
10/28/2016

Real-time Online Action Detection Forests using Spatio-temporal Contexts

Online action detection (OAD) is challenging since 1) robust yet computa...
research
10/17/2020

Efficient and Compact Convolutional Neural Network Architectures for Non-temporal Real-time Fire Detection

Automatic visual fire detection is used to complement traditional fire d...
research
09/10/2018

A Comparison of CNN-based Face and Head Detectors for Real-Time Video Surveillance Applications

Detecting faces and heads appearing in video feeds are challenging tasks...
research
08/30/2023

Two-Stage Violence Detection Using ViTPose and Classification Models at Smart Airports

This study introduces an innovative violence detection framework tailore...
research
05/14/2022

ETAD: A Unified Framework for Efficient Temporal Action Detection

Untrimmed video understanding such as temporal action detection (TAD) of...
