The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose

07/01/2020
by Yizhak Ben-Shabat, et al.

The availability of a large labeled dataset is a key requirement for applying deep learning methods to computer vision tasks. For understanding human activities, existing public datasets, while large, are often limited to a single RGB camera and provide only per-frame or per-clip action annotations. To enable richer analysis and understanding of human activities, we introduce IKEA ASM, a three-million-frame, multi-view furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose. Additionally, we benchmark prominent methods for video action recognition, object segmentation, and human pose estimation on this challenging dataset. The dataset enables the development of holistic methods that integrate multi-modal and multi-view data to perform better on these tasks.

research
08/24/2018

MVOR: A Multi-view RGB-D Operating Room Dataset for 2D and 3D Human Pose Estimation

Person detection and pose estimation is a key requirement to develop int...
research
08/01/2023

Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes

3D human pose estimation in outdoor environments has garnered increasing...
research
05/25/2023

EgoHumans: An Egocentric 3D Multi-Human Benchmark

We present EgoHumans, a new multi-view multi-human video benchmark to ad...
research
03/28/2022

Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities

Assembly101 is a new procedural activity dataset featuring 4321 videos o...
research
07/31/2020

LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities

Understanding and interpreting human actions is a long-standing challeng...
research
08/21/2023

Multi-Modal Dataset Acquisition for Photometrically Challenging Object

This paper addresses the limitations of current datasets for 3D vision t...
research
06/26/2022

Szloca: towards a framework for full 3D tracking through a single camera in context of interactive arts

Realtime virtual data of objects and human presence in a large area hold...