A Novel Two Stream Decision Level Fusion of Vision and Inertial Sensors Data for Automatic Multimodal Human Activity Recognition System

06/27/2023
by Santosh Kumar Yadav, et al.

This paper presents a novel multimodal human activity recognition system based on a two-stream, decision-level fusion of vision and inertial sensor data. In the first stream, raw RGB frames are passed to a part affinity field-based pose estimation network to detect the user's keypoints. These keypoints are pre-processed and fed, in a sliding-window fashion, to a specially designed convolutional neural network for spatial feature extraction, followed by regularized LSTMs that compute the temporal features. The LSTM outputs are then passed to fully connected layers for classification. In the second stream, data from the inertial sensors are pre-processed and fed to regularized LSTMs for feature extraction, followed by fully connected layers for classification. The SoftMax scores of the two streams are then combined by decision-level fusion, which gives the final prediction. Extensive experiments are conducted on four standard multimodal benchmark datasets (UP-Fall detection, UTD-MHAD, Berkeley-MHAD, and C-MHAD). The accuracies obtained by the proposed system are 96.9 95.9 C-MHAD datasets. These results are far superior to the current state-of-the-art methods.
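The sliding-window input and the decision-level fusion step described above can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, window size, or stride, so everything here is an assumption made for illustration: the weighted average of the two SoftMax score vectors, the `vision_weight` parameter, and the helper names `sliding_windows` and `fuse_decisions` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class logits into SoftMax scores that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sliding_windows(frames: Sequence, size: int, stride: int) -> List[list]:
    """Split a per-frame sequence (e.g. keypoints) into overlapping windows.

    The window size and stride are illustrative; the abstract does not give them.
    """
    return [list(frames[i:i + size])
            for i in range(0, len(frames) - size + 1, stride)]


def fuse_decisions(vision_scores: Sequence[float],
                   inertial_scores: Sequence[float],
                   vision_weight: float = 0.5) -> Tuple[int, List[float]]:
    """Fuse two streams' SoftMax scores and return (predicted class, fused scores).

    Assumption: a weighted average is used as the fusion rule. The paper may
    use a different rule (for example, a product or maximum of scores).
    """
    if len(vision_scores) != len(inertial_scores):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= vision_weight <= 1.0:
        raise ValueError("vision_weight must lie in [0, 1]")
    fused = [vision_weight * v + (1.0 - vision_weight) * s
             for v, s in zip(vision_scores, inertial_scores)]
    # The final prediction is the class with the highest fused score.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Hypothetical logits for three activity classes from each stream.
vision = softmax([2.0, 1.0, 0.1])    # vision stream favours class 0
inertial = softmax([0.5, 2.5, 0.3])  # inertial stream favours class 1
label, fused = fuse_decisions(vision, inertial)
print(label, [round(p, 3) for p in fused])
```

In this sketch the inertial stream is more confident than the vision stream, so with equal weights the fused prediction follows the inertial stream. Because fusion happens on class scores rather than on features, each stream can be trained and evaluated independently before being combined.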
