Temporal Extension of Scale Pyramid and Spatial Pyramid Matching for Action Recognition

08/29/2014
by Zhenzhong Lan, et al.

Historically, researchers in the field have devoted considerable effort to creating image representations that are scale invariant and retain spatial location information. This paper proposes to encode the equivalent temporal characteristics in video representations for action recognition. To achieve temporal scale invariance, we develop a method called temporal scale pyramid (TSP). To encode temporal information, we present and compare two methods called temporal extension descriptor (TED) and temporal division pyramid (TDP). Our purpose is to suggest solutions for matching complex actions that vary widely in velocity and appearance, a capability missing from most current action representations. Experimental results on four benchmark datasets, UCF50, HMDB51, Hollywood2 and Olympic Sports, support our approach, which significantly outperforms state-of-the-art methods. Most notably, we achieve 65.0 [...] and Hollywood2 datasets, which constitutes an absolute improvement over the state of the art of 7.8 [...]

research
10/15/2015

Beyond Spatial Pyramid Matching: Space-time Extended Descriptor for Action Recognition

We address the problem of generating video features for action recogniti...
research
03/28/2023

Rethinking matching-based few-shot action recognition

Few-shot action recognition, i.e. recognizing new action classes given o...
research
11/24/2014

Beyond Gaussian Pyramid: Multi-skip Feature Stacking for Action Recognition

Most state-of-the-art action feature extractors involve differential ope...
research
02/10/2021

AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition

Temporal modelling is the key for efficient video action recognition. Wh...
research
04/07/2020

Temporal Pyramid Network for Action Recognition

Visual tempo characterizes the dynamics and the temporal scale of an act...
research
08/27/2016

Spatio-temporal Aware Non-negative Component Representation for Action Recognition

This paper presents a novel mid-level representation for action recognit...
research
11/30/2020

Video Self-Stitching Graph Network for Temporal Action Localization

Temporal action localization (TAL) in videos is a challenging task, espe...
