Contextual Action Recognition with R*CNN

05/05/2015
by   Georgia Gkioxari, et al.

Multiple cues in an image reveal what action a person is performing. For example, a jogger has a pose that is characteristic of jogging, but the scene (e.g. road, trail) and the presence of other joggers can be an additional source of information. In this work, we exploit the simple observation that actions are accompanied by contextual cues to build a strong action recognition system. We adapt RCNN to use more than one region for classification while still maintaining the ability to localize the action. We call our system R*CNN. The action-specific models and the feature maps are trained jointly, allowing action-specific representations to emerge. R*CNN achieves 90.2% mean AP on the PASCAL VOC Action dataset, outperforming all other approaches in the field by a significant margin. Finally, we show that R*CNN is not limited to action recognition. In particular, R*CNN can also be used to tackle fine-grained tasks such as attribute classification. We validate this claim by reporting state-of-the-art performance on the Berkeley Attributes of People dataset.
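To make the idea of classifying from more than one region concrete, here is a minimal sketch in plain Python. It rests on assumptions that the abstract does not state: each action's score is taken to be a linear score on the person (primary) region plus the best linear score over a set of candidate context (secondary) regions, followed by a softmax over actions. The names `action_scores`, `w_primary`, and `w_secondary` are hypothetical, and the feature vectors are toy values rather than CNN features.

```python
import math


def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def action_scores(primary_feat, secondary_feats, w_primary, w_secondary):
    """Score every action from one primary region and several candidate
    secondary regions.

    Assumed rule (not given in the abstract): primary-region score plus the
    maximum score over the secondary regions, computed per action.
    """
    scores = []
    for wp, ws in zip(w_primary, w_secondary):
        primary = dot(wp, primary_feat)
        # Pick the most informative context region for this action.
        context = max(dot(ws, s) for s in secondary_feats)
        scores.append(primary + context)
    return scores


def softmax(scores):
    """Turn raw scores into probabilities over actions."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: 2 actions, 2-dimensional features, 2 candidate context regions.
primary_feat = [1.0, 0.0]
secondary_feats = [[0.0, 1.0], [1.0, 1.0]]
w_primary = [[1.0, 0.0], [0.0, 1.0]]    # one weight vector per action
w_secondary = [[0.0, 2.0], [1.0, 0.0]]  # one weight vector per action

scores = action_scores(primary_feat, secondary_feats, w_primary, w_secondary)
probs = softmax(scores)
```

Because the secondary region is chosen separately for each action, different actions can draw on different parts of the scene. The primary region stays fixed, which is what preserves the ability to localize the person performing the action.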


