SWTF: Sparse Weighted Temporal Fusion for Drone-Based Activity Recognition

11/10/2022
by   Santosh Kumar Yadav, et al.

Drone-camera-based human activity recognition (HAR) has received significant attention from the computer vision research community in the past few years. A robust and efficient HAR system plays a pivotal role in fields such as video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. The task is challenging because of complex poses, varying viewpoints, and the environmental conditions in which the action takes place. To address these complexities, in this paper we propose a novel Sparse Weighted Temporal Fusion (SWTF) module that uses sparsely sampled video frames to obtain a global weighted temporal fusion outcome. The proposed SWTF has two components. The first is a temporal segment network that sparsely samples a given set of frames. The second is weighted temporal fusion, which fuses feature maps derived from optical flow with the raw RGB images. This is followed by a base network, comprising a convolutional neural network module and fully connected layers, that performs the activity recognition. The SWTF network can be used as a plug-in module for existing deep CNN architectures, enabling them to learn temporal information without a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The reported accuracies include 72.76% and 92.56%, surpassing previous state-of-the-art performance by a significant margin.
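The two components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sparse_segment_indices` and `weighted_fusion`, the choice of the centre frame of each segment, and the convex combination controlled by a hypothetical weight `alpha` are all assumptions, since the abstract does not specify the sampling rule within a segment or the exact fusion formula.

```python
import numpy as np


def sparse_segment_indices(num_frames: int, num_segments: int) -> list[int]:
    """Split a video into equal segments and pick one frame index per segment.

    Assumption: the centre frame of each segment is used. A temporal segment
    network may instead sample a random frame per segment during training.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need 0 < num_segments <= num_frames")
    # Integer arithmetic for the midpoint of segment i.
    return [(2 * i + 1) * num_frames // (2 * num_segments)
            for i in range(num_segments)]


def weighted_fusion(rgb: np.ndarray, flow_maps: np.ndarray,
                    alpha: float = 0.5) -> np.ndarray:
    """Blend RGB frames with optical-flow feature maps of the same shape.

    Assumption: a simple convex combination with a fixed weight `alpha`;
    the weighting used in the paper may differ.
    """
    if rgb.shape != flow_maps.shape:
        raise ValueError("rgb and flow_maps must have the same shape")
    return (1.0 - alpha) * rgb + alpha * flow_maps


# Usage: sample 3 frames from a 12-frame clip, then fuse the two modalities.
indices = sparse_segment_indices(num_frames=12, num_segments=3)
rgb = np.ones((len(indices), 4, 4, 3))   # stand-in RGB frames
flow_maps = np.zeros_like(rgb)           # stand-in optical-flow feature maps
fused = weighted_fusion(rgb, flow_maps, alpha=0.25)
```

Under these assumptions, the fused tensor would be passed to an ordinary single-stream CNN, which is how a plug-in module of this kind could avoid a separate temporal stream.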


research
12/07/2022

DroneAttention: Sparse Weighted Temporal Attention for Drone-Camera Based Activity Recognition

Human activity recognition (HAR) using drone-mounted cameras has attract...
research
08/22/2017

Activity Recognition based on a Magnitude-Orientation Stream Network

The temporal component of videos provides an important clue for activity...
research
08/09/2022

Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework

Vision-based human activity recognition has emerged as one of the essent...
research
03/21/2022

Fourier Disentangled Space-Time Attention for Aerial Video Recognition

We present an algorithm, Fourier Activity Recognition (FAR), for UAV vid...
research
02/20/2021

Efficient Multi-stream Temporal Learning and Post-fusion Strategy for 3D Skeleton-based Hand Activity Recognition

Recognizing first-person hand activity is a challenging task, especially...
research
09/02/2015

Manipulated Object Proposal: A Discriminative Object Extraction and Feature Fusion Framework for First-Person Daily Activity Recognition

Detecting and recognizing objects interacting with humans lie in the cen...
