HAMLET: A Hierarchical Multimodal Attention-based Human Activity Recognition Algorithm

08/03/2020
by Md Mofijul Islam, et al.

To fluently collaborate with people, robots need the ability to recognize human activities accurately. Although modern robots are equipped with various sensors, robust human activity recognition (HAR) remains a challenging task due to the difficulties of multimodal data fusion. To address these challenges, we introduce HAMLET, a deep neural network-based multimodal HAR algorithm. HAMLET incorporates a hierarchical architecture: the lower layer encodes spatio-temporal features from unimodal data using a multi-head self-attention mechanism, while the upper layer applies a novel multimodal attention mechanism to disentangle and fuse the salient unimodal features into multimodal features. Finally, the multimodal features are fed to a fully connected neural network to recognize human activities. We evaluated HAMLET by comparing its performance to several state-of-the-art activity recognition algorithms on three human activity datasets. The results suggest that HAMLET outperformed all evaluated baselines across all datasets and metrics tested, with the highest top-1 accuracies of 95.12% and 97.45% on the UTD-MHAD [1] and UT-Kinect [2] datasets, respectively, and an F1-score of 81.52% on the UCSD-MIT [3] dataset. We further visualize the unimodal and multimodal attention maps, which provide a tool for interpreting the impact of the attention mechanisms on HAR.
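The hierarchy described in the abstract (per-modality self-attention in the lower layer, attentive fusion of unimodal features in the upper layer, and a fully connected classification head) can be illustrated compactly. Below is a minimal PyTorch sketch of that idea; the module names, dimensions, temporal pooling, and the exact fusion scoring are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn

class UnimodalEncoder(nn.Module):
    """Lower layer: encodes spatio-temporal features of one modality
    with multi-head self-attention over its time steps."""
    def __init__(self, feat_dim, d_model=128, n_heads=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        # x: (batch, time, feat_dim) -> (batch, d_model)
        h = self.proj(x)
        h, _ = self.attn(h, h, h)   # temporal self-attention
        h = self.norm(h)
        return h.mean(dim=1)        # pool over time (an assumed choice)

class MultimodalAttentionFusion(nn.Module):
    """Upper layer: scores each modality's encoded feature and fuses
    them as an attention-weighted sum (one plausible reading of the
    paper's multimodal attention; the paper's exact scheme differs)."""
    def __init__(self, d_model=128):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, feats):
        h = torch.stack(feats, dim=1)            # (batch, n_modalities, d_model)
        w = torch.softmax(self.score(h), dim=1)  # per-modality attention weights
        return (w * h).sum(dim=1)                # fused feature: (batch, d_model)

class HamletSketch(nn.Module):
    def __init__(self, modality_dims, n_classes, d_model=128):
        super().__init__()
        self.encoders = nn.ModuleList(
            UnimodalEncoder(d, d_model) for d in modality_dims
        )
        self.fusion = MultimodalAttentionFusion(d_model)
        self.classifier = nn.Sequential(         # fully connected head
            nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, n_classes)
        )

    def forward(self, inputs):
        # inputs: one (batch, time, feat_dim) tensor per modality
        feats = [enc(x) for enc, x in zip(self.encoders, inputs)]
        return self.classifier(self.fusion(feats))

# Hypothetical usage: 2048-d RGB features and 75-d skeleton features, 10 activities.
model = HamletSketch([2048, 75], n_classes=10)
rgb = torch.randn(8, 16, 2048)    # (batch, time, feat)
skel = torch.randn(8, 16, 75)
logits = model([rgb, skel])       # (8, 10)

The per-modality weights w in MultimodalAttentionFusion are the kind of quantity one would plot to obtain the multimodal attention maps mentioned in the abstract.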

Related research

research · 10/14/2022
MMTSA: Multimodal Temporal Segment Attention Network for Efficient Human Activity Recognition
Multimodal sensors (e.g., visual, non-visual, and wearable) provide comp...

research · 03/08/2023
Robust Multimodal Fusion for Human Activity Recognition
The proliferation of IoT and mobile devices equipped with heterogeneous ...

research · 03/07/2021
Hierarchical Self Attention Based Autoencoder for Open-Set Human Activity Recognition
Wearable sensor based human activity recognition is a challenging proble...

research · 08/15/2022
Self-Supervised Multimodal Fusion Transformer for Passive Activity Recognition
The pervasiveness of Wi-Fi signals provides significant opportunities fo...

research · 05/17/2018
Interpretable Parallel Recurrent Neural Networks with Convolutional Attentions for Multi-Modality Activity Modeling
Multimodal features play a key role in wearable sensor-based human activ...

research · 11/21/2017
Fullie and Wiselie: A Dual-Stream Recurrent Convolutional Attention Model for Activity Recognition
Multimodal features play a key role in wearable sensor based Human Activ...

research · 01/26/2020
Multimodal Data Fusion based on the Global Workspace Theory
We propose a novel neural network architecture, named the Global Workspa...
