NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding

05/12/2019
by Jun Liu, et al.

Research on depth-based human activity analysis has achieved outstanding performance and demonstrated the effectiveness of 3D representations for action recognition. However, the existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations: they lack large-scale training samples, a realistic number of distinct class categories, diverse camera views, varied environmental conditions, and a variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes, including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset and show the advantage of applying deep learning methods to 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset and propose a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding. [The dataset is available at: http://rose1.ntu.edu.sg/Datasets/actionRecognition.asp]


