DeepMoCap: Deep Optical Motion Capture Using Multiple Depth Sensors and Retro-Reflectors

by Anargyros Chatzitofis, et al.

In this paper, a marker-based, single-person optical motion capture method (DeepMoCap) is proposed using multiple spatio-temporally aligned infrared-depth sensors and retro-reflective straps and patches (reflectors). DeepMoCap explores motion capture by automatically localizing and labeling reflectors on depth images and, subsequently, in 3D space. Introducing a non-parametric representation to encode the temporal correlation among pairs of colorized depthmaps and 3D optical flow frames, a multi-stage Fully Convolutional Network (FCN) architecture is proposed to jointly learn reflector locations and their temporal dependency among sequential frames. The extracted 2D reflector locations are spatially mapped into 3D space, resulting in robust 3D optical data extraction. The subject's motion is efficiently captured by applying a template-based fitting technique to the extracted optical data. Two datasets have been created and made publicly available for evaluation purposes: one comprising multi-view depth and 3D optical flow annotated images (DMC2.5D), and a second consisting of spatio-temporally aligned multi-view depth images along with skeleton, inertial and ground truth MoCap data (DMC3D). The FCN model outperforms its competitors on the DMC2.5D dataset using the 2D Percentage of Correct Keypoints (PCK) metric, while the motion capture outcome is evaluated against RGB-D and inertial data fusion approaches on DMC3D, outperforming the next best method by 4.5
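The mapping of a 2D reflector location into 3D space can be illustrated with a standard pinhole back-projection. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `backproject`, the intrinsic values, and the use of a single depth reading in metres are all hypothetical and are not taken from the paper.

```python
from typing import Optional, Tuple


def backproject(
    u: float, v: float, depth_m: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Tuple[float, float, float]]:
    """Map a pixel (u, v) with a depth reading to a 3D point in the sensor frame.

    Assumes an ideal pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy), all in pixels, and depth in metres.
    Returns None when the depth reading is invalid (zero or negative).
    """
    if depth_m <= 0:
        return None
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for illustration only.
FX, FY, CX, CY = 365.0, 365.0, 256.0, 212.0

# A reflector detected at pixel (300, 200) with a 2 m depth reading.
point = backproject(300.0, 200.0, 2.0, FX, FY, CX, CY)
print(point)
```

In a multi-sensor setup, each point obtained this way would still need to be transformed by that sensor's extrinsic calibration before points from different views can be combined; that step is omitted here.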







Learning Multi-Human Optical Flow

The optical flow of humans is well known to be useful for the analysis o...

Asymmetric Bilateral Phase Correlation for Optical Flow Estimation in the Frequency Domain

We address the problem of motion estimation in images operating in the f...

Flow-Motion and Depth Network for Monocular Stereo and Beyond

We propose a learning-based method that solves monocular stereo and can ...

FILM: Frame Interpolation for Large Motion

We present a frame interpolation algorithm that synthesizes multiple int...

STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling

We propose a novel superpixel-based multi-view convolutional neural netw...

Temporal Unknown Incremental Clustering (TUIC) Model for Analysis of Traffic Surveillance Videos

Optimized scene representation is an important characteristic of a frame...