Loss Guided Activation for Action Recognition in Still Images

12/11/2018
by   Lu Liu, et al.

A significant problem in deep-learning-based human action recognition is that models are easily misled by irrelevant objects or backgrounds. Existing methods commonly address this by taking bounding boxes of the target humans as part of the input, in both training and testing, so that the model can ignore irrelevant context and extract only human features. We consider this solution inefficient, since bounding boxes might not be available. Instead of using a person bounding box as an input, we introduce a human-mask loss that automatically guides the activations of the feature maps toward the target human performing the action and thereby suppresses the activations of misleading contexts. We propose a multi-task deep learning method that jointly predicts the human action class and a human location heatmap. Extensive experiments demonstrate that our approach is more robust than the baseline methods in the presence of irrelevant, misleading contexts. Our method achieves 94.06% and 40.65% mAP on the Stanford40 and MPII datasets respectively, relative improvements of 3.14% and 12.6% over the best results reported in the literature, setting new state-of-the-art results. Additionally, unlike some existing methods, our method does not require a person bounding box as an input during testing.
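To make the multi-task idea concrete, the following is a minimal sketch of how an action-classification loss and a human-mask loss might be combined into one training objective. The abstract does not specify the loss forms, so everything here is an assumption for illustration: the softmax cross-entropy for the action class, the per-pixel binary cross-entropy for the heatmap, the weighting parameter `mask_weight`, and all function names are hypothetical and are not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over action-class logits (assumed loss form)."""
    m = max(logits)
    # log-sum-exp with the maximum subtracted for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_index]


def mask_bce(heatmap_logits, mask):
    """Mean per-pixel binary cross-entropy between heatmap logits and a 0/1 human mask.

    This stands in for the human-mask loss; the paper's actual formulation may differ.
    """
    total = 0.0
    count = 0
    for row_logits, row_mask in zip(heatmap_logits, mask):
        for z, y in zip(row_logits, row_mask):
            # stable BCE-with-logits: max(z, 0) - z*y + log(1 + exp(-|z|))
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


def multitask_loss(action_logits, action_label, heatmap_logits, human_mask,
                   mask_weight=1.0):
    """Joint objective: action loss plus weighted mask loss (mask_weight is hypothetical)."""
    action_term = softmax_cross_entropy(action_logits, action_label)
    mask_term = mask_bce(heatmap_logits, human_mask)
    return action_term + mask_weight * mask_term


# Toy 2x2 example: the human occupies the left column of the image.
human_mask = [[1, 0], [1, 0]]
on_human = [[4.0, -4.0], [4.0, -4.0]]    # activations concentrated on the human
on_context = [[-4.0, 4.0], [-4.0, 4.0]]  # activations concentrated on the background
action_logits = [2.0, 0.5, -1.0]

loss_on_human = multitask_loss(action_logits, 0, on_human, human_mask)
loss_on_context = multitask_loss(action_logits, 0, on_context, human_mask)
```

In this toy setup, a heatmap that fires on the background is penalized more heavily than one that fires on the human, which is the sense in which such a loss would steer activations toward the person. The mask appears only in the training objective, so inference in this sketch needs nothing beyond the image-derived predictions.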


