Visibility Aware Human-Object Interaction Tracking from Single RGB Camera

03/29/2023
by   Xianghui Xie, et al.

Capturing the interactions between humans and their environment in 3D is important for many applications in robotics, graphics, and vision. Recent methods that reconstruct the 3D human and object from a single RGB image do not produce consistent relative translation across frames because they assume a fixed depth. Moreover, their performance drops significantly when the object is occluded. In this work, we propose a novel method that tracks the 3D human, the object, the contacts between them, and their relative translation across frames from a single RGB camera, while remaining robust to heavy occlusions. Our method is built on two key insights. First, we condition our neural field reconstructions of the human and object on per-frame SMPL model estimates obtained by pre-fitting SMPL to the video sequence. This improves neural reconstruction accuracy and produces coherent relative translation across frames. Second, human and object motion from visible frames provides valuable information for inferring the occluded object. We propose a novel transformer-based neural network that explicitly uses object visibility and human motion to leverage neighbouring frames when making predictions for occluded frames. Building on these insights, our method tracks both human and object robustly, even under occlusions. Experiments on two datasets show that our method significantly improves over state-of-the-art methods. Our code and pretrained models are available at: https://virtualhumans.mpi-inf.mpg.de/VisTracker
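
The second insight, using visible neighbouring frames to fill in occluded ones, can be illustrated with a deliberately simple stand-in. The sketch below is not the transformer described above: it swaps in plain linear interpolation of the object translation between the nearest visible frames. The function name `fill_occluded`, the per-frame visibility scores in [0, 1], and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, ...]


def fill_occluded(
    translations: Sequence[Sequence[float]],
    visibility: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff: below this a frame counts as occluded
) -> List[Vec3]:
    """Replace object translations in occluded frames with values
    interpolated from the nearest visible frames (a toy baseline,
    not the learned model from the paper)."""
    if len(translations) != len(visibility):
        raise ValueError("translations and visibility must have equal length")

    visible = [i for i, v in enumerate(visibility) if v >= threshold]
    if not visible:
        raise ValueError("need at least one visible frame")

    out: List[Vec3] = [tuple(t) for t in translations]
    for i, v in enumerate(visibility):
        if v >= threshold:
            continue  # trust the per-frame estimate when the object is visible
        prev: Optional[int] = max((j for j in visible if j < i), default=None)
        nxt: Optional[int] = min((j for j in visible if j > i), default=None)
        if prev is None:
            out[i] = tuple(translations[nxt])   # no earlier visible frame: copy the next one
        elif nxt is None:
            out[i] = tuple(translations[prev])  # no later visible frame: copy the previous one
        else:
            w = (i - prev) / (nxt - prev)       # interpolation weight in (0, 1)
            out[i] = tuple(
                (1 - w) * a + w * b
                for a, b in zip(translations[prev], translations[nxt])
            )
    return out


# Frames 1 and 2 are occluded, so their unreliable estimates are replaced.
demo = fill_occluded(
    [(0.0, 0.0, 2.0), (9.0, 9.0, 9.0), (9.0, 9.0, 9.0), (3.0, 0.0, 2.0)],
    [1.0, 0.1, 0.2, 0.9],
)
print(demo)
```

A learned model can go further than this baseline because it also sees the human motion, which linear interpolation ignores entirely.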

research
04/05/2022

CHORE: Contact, Human and Object REconstruction from a single RGB image

While most works in computer vision and learning have focused on perceiv...
research
10/30/2019

Motion-Nets: 6D Tracking of Unknown Objects in Unseen Environments using RGB

In this work, we bridge the gap between recent pose estimation and track...
research
09/19/2022

HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking

Robust and accurate planar tracking over a whole video sequence is vital...
research
09/30/2019

Track to Reconstruct and Reconstruct to Track

Object tracking and reconstruction are often performed together, with tr...
research
08/07/2019

A Robust Billboard-based Free-viewpoint Video Synthesizing Algorithm for Sports Scenes

We present a billboard-based free-viewpoint video synthesizing algorithm...
research
09/12/2022

Articulated 3D Human-Object Interactions from RGB Videos: An Empirical Analysis of Approaches and Challenges

Human-object interactions with articulated objects are common in everyda...
research
08/04/2022

Occupancy Planes for Single-view RGB-D Human Reconstruction

Single-view RGB-D human reconstruction with implicit functions is often ...
