Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework

08/09/2022
by Hayat Ullah, et al.

Vision-based human activity recognition has emerged as one of the essential research areas in the video analytics domain. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams, and they have shown impressive performance on the human activity recognition task. However, these newly introduced methods focus exclusively either on model performance or on computational efficiency and robustness, resulting in a biased tradeoff when dealing with the challenging human activity recognition problem. To overcome these limitations of contemporary deep learning models, this paper presents a computationally efficient yet generic spatial-temporal cascaded framework that exploits deep discriminative spatial and temporal features for human activity recognition. For efficient representation of human actions, we propose a dual attentional convolutional neural network (CNN) architecture that leverages a unified channel-spatial attention mechanism to extract human-centric salient features from video frames. The dual channel-spatial attention layers, together with the convolutional layers, learn to attend more strongly to the spatial receptive fields that contain objects, across the feature maps. The extracted discriminative salient features are then forwarded to a stacked bi-directional gated recurrent unit (Bi-GRU) for long-term temporal modeling and recognition of human actions, using both forward and backward pass gradient learning. Extensive experiments show that the proposed framework improves execution speed by up to 167 times, measured in frames per second, compared to most contemporary action recognition methods.
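To make the idea of channel-spatial attention concrete, here is a minimal, dependency-free sketch of how a feature map could be reweighted first per channel and then per spatial position. It is an illustration under stated assumptions, not the architecture from the paper: the function names are hypothetical, the attention weights come from plain averages passed through a sigmoid with no learned parameters, and the channel-then-spatial order is assumed because the abstract does not specify it.

```python
import math


def sigmoid(x):
    """Logistic function, used to squash attention scores into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Return one weight per channel from its global average.

    fmap is a list of C channels, each an H x W list of lists.
    Assumption: no learned layers; a real model would learn this mapping.
    """
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return weights


def spatial_attention(fmap):
    """Return an H x W map of weights from the mean over channels."""
    c = len(fmap)
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def dual_attention(fmap):
    """Scale by channel weights, then by spatial weights (assumed order)."""
    cw = channel_attention(fmap)
    scaled = [
        [[v * cw[k] for v in row] for row in channel]
        for k, channel in enumerate(fmap)
    ]
    sw = spatial_attention(scaled)
    return [
        [[v * sw[i][j] for j, v in enumerate(row)] for i, row in enumerate(channel)]
        for channel in scaled
    ]


# Toy feature map: 2 channels of size 2 x 2 (values are made up).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
attended = dual_attention(features)
```

In this toy setup the output keeps the input shape, and channels or positions with larger average activation receive weights closer to 1. In a framework of the kind the abstract describes, per-frame features produced this way would then be passed as a sequence to the recurrent stage.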


research
06/25/2020

DanHAR: Dual Attention Network For Multimodal Human Activity Recognition Using Wearable Sensors

Human activity recognition (HAR) in ubiquitous computing has been beginn...
research
07/27/2021

Real-Time Activity Recognition and Intention Recognition Using a Vision-based Embedded System

With the rapid increase in digital technologies, most fields of study in...
research
02/21/2021

Efficient Two-Stream Network for Violence Detection Using Separable Convolutional LSTM

Automatically detecting violence from surveillance footage is a subset o...
research
01/17/2021

Human Activity Recognition Using Multichannel Convolutional Neural Network

Human Activity Recognition (HAR) simply refers to the capacity of a mach...
research
11/21/2017

Fullie and Wiselie: A Dual-Stream Recurrent Convolutional Attention Model for Activity Recognition

Multimodal features play a key role in wearable sensor based Human Activ...
research
09/29/2017

Impact of Three-Dimensional Video Scalability on Multi-View Activity Recognition using Deep Learning

Human activity recognition is one of the important research topics in co...
research
11/10/2022

SWTF: Sparse Weighted Temporal Fusion for Drone-Based Activity Recognition

Drone-camera based human activity recognition (HAR) has received signifi...
