InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction

09/26/2022
by Yinghao Huang, et al.

Humans constantly interact with daily objects to accomplish tasks. To understand such interactions, computers need to reconstruct them from cameras observing whole-body interaction with scenes. This is challenging due to occlusion between the body and objects, motion blur, depth/scale ambiguities, and the low image resolution of hands and graspable object parts. To make the problem tractable, the community focuses either on interacting hands, ignoring the body, or on interacting bodies, ignoring hands. The GRAB dataset addresses dexterous whole-body interaction but uses marker-based MoCap and lacks images, while BEHAVE captures video of body-object interaction but lacks hand detail. We address the limitations of prior work with InterCap, a novel method that reconstructs interacting whole bodies and objects from multi-view RGB-D data, using the parametric whole-body model SMPL-X and known object meshes. To tackle the above challenges, InterCap builds on two key observations: (i) Contact between the hand and object can be used to improve the pose estimation of both. (ii) Azure Kinect sensors allow us to set up a simple multi-view RGB-D capture system that minimizes the effect of occlusion while providing reasonable inter-camera synchronization. With this method we capture the InterCap dataset, which contains 10 subjects (5 males and 5 females) interacting with 10 objects of various sizes and affordances, including contact with the hands or feet. In total, InterCap has 223 RGB-D videos, resulting in 67,357 multi-view frames, each containing 6 RGB-D images. Our method provides pseudo ground-truth body meshes and objects for each video frame. Our InterCap method and dataset fill an important gap in the literature and support many research directions. Our data and code are available for research purposes.
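Observation (i) can be illustrated with a small sketch of a contact term. This is not the InterCap implementation: the abstract does not give the formulation, so the function name `contact_penalty`, the 2 cm `threshold`, and the quadratic form are all assumptions chosen for illustration. The idea shown is only that hand points already near the object contribute a penalty that shrinks as they move into contact, which an optimizer could use to refine both poses.

```python
import math

def contact_penalty(hand_points, object_points, threshold=0.02):
    """Hypothetical contact term (assumption, not the paper's formulation).

    For every hand point whose nearest object point is closer than
    `threshold` (meters), add the squared nearest distance. Points far
    from the object are ignored, so only likely contacts are pulled closed.
    """
    total = 0.0
    for h in hand_points:
        # Brute-force nearest neighbor; fine for a toy example.
        nearest = min(math.dist(h, o) for o in object_points)
        if nearest < threshold:
            total += nearest ** 2
    return total

# Toy data (made up): one hand point 1 cm from the object, one far away.
hand = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
obj = [(0.01, 0.0, 0.0), (1.0, 1.0, 1.0)]
penalty = contact_penalty(hand, obj)
print(penalty)
```

In a real pipeline the point sets would come from SMPL-X hand vertices and the known object mesh, and a term like this would be one of several in a joint fitting objective; those details are not specified here.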


research
08/25/2020

GRAB: A Dataset of Whole-Body Human Grasping of Objects

Training computers to understand, model, and synthesize human grasping r...
research
08/20/2019

Resolving 3D Human Pose Ambiguities with 3D Scene Constraints

To understand and analyze human behavior, we need to capture humans movi...
research
04/22/2021

H2O: Two Hands Manipulating Objects for First Person Interaction Recognition

We present, for the first time, a comprehensive framework for egocentric...
research
04/14/2022

BEHAVE: Dataset and Method for Tracking Human Object Interactions

Modelling interactions between humans and objects in natural environment...
research
12/15/2022

NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions

Humans constantly interact with objects in daily life tasks. Capturing s...
research
01/18/2023

HMDO: Markerless Multi-view Hand Manipulation Capture with Deformable Objects

We construct the first markerless deformable interaction dataset recordi...
research
04/09/2020

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

Robots and other smart devices need efficient object-based scene represe...
