DeepAI
Log In Sign Up

Regional Attention Network (RAN) for Head Pose and Fine-grained Gesture Recognition

01/17/2021
by   Ardhendu Behera, et al.
2

Affect is often expressed via non-verbal body language such as actions/gestures, which are vital indicators for human behaviors. Recent studies on recognition of fine-grained actions/gestures in monocular images have mainly focused on modeling spatial configuration of body parts representing body pose, human-objects interactions and variations in local appearance. The results show that this is a brittle approach since it relies on accurate body parts/objects detection. In this work, we argue that there exist local discriminative semantic regions, whose "informativeness" can be evaluated by the attention mechanism for inferring fine-grained gestures/actions. To this end, we propose a novel end-to-end Regional Attention Network (RAN), which is a fully Convolutional Neural Network (CNN) to combine multiple contextual regions through attention mechanism, focusing on parts of the images that are most relevant to a given task. Our regions consist of one or more consecutive cells and are adapted from the strategies used in computing HOG (Histogram of Oriented Gradient) descriptor. The model is extensively evaluated on ten datasets belonging to 3 different scenarios: 1) head pose recognition, 2) drivers state recognition, and 3) human action and facial expression recognition. The proposed approach outperforms the state-of-the-art by a considerable margin in different metrics.

READ FULL TEXT

page 4

page 9

page 11

page 12

page 14

10/23/2021

Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition

This paper presents a novel keypoints-based attention mechanism for visu...
09/05/2022

SR-GNN: Spatial Relation-aware Graph Neural Network for Fine-Grained Image Categorization

Over the past few years, a significant progress has been made in deep co...
01/17/2021

Coarse Temporal Attention Network (CTA-Net) for Driver's Activity Recognition

There is significant progress in recognizing traditional human activitie...
05/17/2021

A Fine-Grained Visual Attention Approach for Fingerspelling Recognition in the Wild

Fingerspelling in sign language has been the means of communicating tech...
05/24/2019

Pose-adaptive Hierarchical Attention Network for Facial Expression Recognition

Multi-view facial expression recognition (FER) is a challenging task bec...
05/11/2019

Follow the Attention: Combining Partial Pose and Object Motion for Fine-Grained Action Detection

Activity recognition in shopping environments is an important and challe...
05/15/2021

One for All: An End-to-End Compact Solution for Hand Gesture Recognition

The HGR is a quite challenging task as its performance is influenced by ...