Universal-to-Specific Framework for Complex Action Recognition

07/13/2020
by   Peisen Zhao, et al.

Video-based action recognition has recently attracted much attention in the field of computer vision. To solve more complex recognition tasks, it has become necessary to distinguish different levels of interclass variations. Inspired by a common flowchart based on the human decision-making process that first narrows down the probable classes and then applies a "rethinking" process for finer-level recognition, we propose an effective universal-to-specific (U2S) framework for complex action recognition. The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network. The universal network first learns universal feature representations. The mask network then generates attention masks for confusing classes through category regularization based on the output of the universal network. The mask is further used to guide the category-specific network for class-specific feature representations. The entire framework is optimized in an end-to-end manner. Experiments on a variety of benchmark datasets, e.g., the Something-Something, UCF101, and HMDB51 datasets, demonstrate the effectiveness of the U2S framework; i.e., U2S can focus on discriminative spatiotemporal regions for confusing categories. We further visualize the relationship between different classes, showing that U2S indeed improves the discriminability of learned features. Moreover, the proposed U2S model is a general framework and may adopt any base recognition network.
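The three-stage data flow described above (universal prediction, then a mask derived from that prediction, then a category-specific prediction on masked features) can be sketched roughly as follows. This is only an illustrative toy under stated assumptions, not the authors' implementation: the abstract does not specify the architectures, so each subnetwork is replaced here by a single randomly initialized linear layer, the mask acts on feature dimensions rather than on spatiotemporal regions, and the names `U2SSketch`, `w_universal`, `w_mask`, and `w_specific` are hypothetical. The category regularization and the end-to-end training loop are omitted.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class U2SSketch:
    """Toy stand-in for the universal / mask / category-specific pipeline.

    Assumption: each subnetwork is a single linear layer. A real model
    would use a video backbone in place of these layers.
    """

    def __init__(self, feat_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.w_universal = random_matrix(num_classes, feat_dim, rng)
        self.w_mask = random_matrix(feat_dim, num_classes, rng)
        self.w_specific = random_matrix(num_classes, feat_dim, rng)

    def forward(self, features):
        # Stage 1: the universal network gives a first class distribution.
        p_universal = softmax(linear(self.w_universal, features))
        # Stage 2: the mask network maps that distribution to a soft
        # attention mask in (0, 1), here one value per feature dimension.
        mask = [sigmoid(v) for v in linear(self.w_mask, p_universal)]
        # Stage 3: the category-specific network re-classifies the
        # masked features (the "rethinking" step).
        masked = [m * f for m, f in zip(mask, features)]
        p_specific = softmax(linear(self.w_specific, masked))
        return p_universal, mask, p_specific
```

Calling `U2SSketch(feat_dim=8, num_classes=4).forward(features)` on an 8-dimensional feature vector returns the first-pass class probabilities, the mask, and the refined class probabilities. Because the mask is computed from the universal output, gradients from the refined prediction would reach all three parts in a differentiable implementation, which is consistent with the end-to-end optimization the abstract mentions.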
