Depth2Action: Exploring Embedded Depth for Large-Scale Action Recognition

08/15/2016
by Yi Zhu, et al.

This paper presents the first investigation of depth for large-scale human action recognition in video, where the depth cues are estimated from the videos themselves. We develop a new framework called depth2action and experiment thoroughly with how best to incorporate the depth information. We introduce spatio-temporal depth normalization (STDN) to enforce temporal consistency in our estimated depth sequences. We also propose modified depth motion maps (MDMM) to capture the subtle temporal changes in depth. These two components significantly improve action recognition performance. We evaluate our depth2action framework on three large-scale action recognition video benchmarks. Our model achieves state-of-the-art performance when combined with appearance and motion information, demonstrating that depth2action is complementary to existing approaches.
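
The abstract names STDN and MDMM but does not define them, so the following is only a rough illustration of the two underlying ideas, not the authors' method. It assumes that temporal consistency can be approximated by min-max normalizing a whole clip with one shared range (instead of per frame), and it uses a classic depth motion map, that is, the accumulated absolute difference between consecutive depth frames. The names `normalize_clip` and `depth_motion_map` are hypothetical.

```python
from typing import List

Frame = List[List[float]]  # one depth map, stored as rows of depth values


def normalize_clip(clip: List[Frame]) -> List[Frame]:
    """Min-max normalize a clip jointly: one min/max shared by all frames.

    Assumption: a shared range is a stand-in for temporal consistency;
    the paper's actual STDN may differ.
    """
    values = [v for frame in clip for row in frame for v in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant clip: map everything to 0 to avoid dividing by zero.
        return [[[0.0 for _ in row] for row in frame] for frame in clip]
    return [[[(v - lo) / span for v in row] for row in frame] for frame in clip]


def depth_motion_map(clip: List[Frame]) -> Frame:
    """Accumulate absolute differences between consecutive depth frames.

    This is the classic depth motion map, not the modified variant (MDMM).
    """
    height, width = len(clip[0]), len(clip[0][0])
    dmm = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(clip, clip[1:]):
        for y in range(height):
            for x in range(width):
                dmm[y][x] += abs(curr[y][x] - prev[y][x])
    return dmm


# Toy 3-frame clip of 2x2 depth maps (made-up values).
clip = [
    [[0.0, 2.0], [4.0, 4.0]],
    [[0.0, 4.0], [4.0, 2.0]],
    [[0.0, 4.0], [4.0, 0.0]],
]
normalized = normalize_clip(clip)
dmm = depth_motion_map(normalized)
print(dmm)
```

Pixels whose estimated depth never changes accumulate zero, while pixels with repeated small changes accumulate a larger value, which is one way subtle temporal changes in depth could be summarized in a single image.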

research
10/22/2020

Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition

In recent years, a number of approaches based on 2D CNNs and 3D CNNs hav...
research
01/19/2021

Human Action Recognition Based on Multi-scale Feature Maps from Depth Video Sequences

Human action recognition is an active research area in computer vision. ...
research
06/06/2015

First-Take-All: Temporal Order-Preserving Hashing for 3D Action Videos

With the prevalence of the commodity depth cameras, the new paradigm of ...
research
09/05/2023

EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding

With the surge in attention to Egocentric Hand-Object Interaction (Ego-H...
research
01/20/2015

Deep Convolutional Neural Networks for Action Recognition Using Depth Map Sequences

Recently, deep learning approach has achieved promising results in vario...
research
05/12/2020

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

To facilitate depth-based 3D action recognition, 3D dynamic voxel (3DV) ...
research
09/08/2019

Multi-Modal Three-Stream Network for Action Recognition

Human action recognition in video is an active yet challenging research ...
