EventHPE: Event-based 3D Human Pose and Shape Estimation

by Shihao Zou, et al.

The event camera is an emerging imaging sensor that captures the dynamics of moving objects as events, which motivates our work on estimating 3D human pose and shape from event signals. Events, however, pose unique challenges: rather than capturing static body postures, event signals are best at capturing local motions. This leads us to propose a two-stage deep learning approach called EventHPE. The first stage, FlowNet, is trained by unsupervised learning to infer optical flow from events. Both events and optical flow are closely related to human body dynamics, and both are fed as input to ShapeNet in the second stage to estimate 3D human shapes. To mitigate the discrepancy between image-based flow (optical flow) and shape-based flow (vertex movement of the human body shape), a novel flow coherence loss is introduced that exploits the fact that both flows originate from the same human motion. We curate an in-house event-based 3D human dataset with 3D pose and shape annotations, which is, to our knowledge, by far the largest of its kind. Empirical evaluations on the DHP19 dataset and our in-house dataset demonstrate the effectiveness of our approach.
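
The abstract does not state the exact form of the flow coherence loss. As a rough illustration of the idea only, the sketch below assumes a cosine-similarity penalty between the optical flow sampled at each projected body vertex and the 2D displacement of that vertex between two time steps. The function name `flow_coherence_loss`, its arguments, and the nearest-pixel sampling are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

def flow_coherence_loss(optical_flow, verts_2d_prev, verts_2d_next, eps=1e-8):
    """Hypothetical direction-agreement loss between two kinds of flow.

    optical_flow:  (H, W, 2) array holding (dx, dy) per pixel (image-based flow).
    verts_2d_prev: (N, 2) projected vertex positions (x, y) at the earlier time.
    verts_2d_next: (N, 2) projected vertex positions (x, y) at the later time.
    Returns the mean of (1 - cosine similarity) over vertices, in [0, 2].
    """
    h, w, _ = optical_flow.shape

    # Shape-based flow: how far each projected vertex moved in the image plane.
    shape_flow = verts_2d_next - verts_2d_prev

    # Image-based flow: sample the optical flow at the nearest pixel to each
    # vertex (an assumption; bilinear sampling would be a natural alternative).
    cols = np.clip(np.rint(verts_2d_prev[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(verts_2d_prev[:, 1]).astype(int), 0, h - 1)
    image_flow = optical_flow[rows, cols]

    # Cosine similarity compares directions only, ignoring flow magnitude.
    dot = np.sum(shape_flow * image_flow, axis=1)
    norms = np.linalg.norm(shape_flow, axis=1) * np.linalg.norm(image_flow, axis=1)
    cosine = dot / (norms + eps)

    return float(np.mean(1.0 - cosine))
```

Under these assumptions the loss is close to 0 when every vertex moves in the direction of the optical flow at its location, and close to 2 when every vertex moves in the opposite direction. Comparing directions rather than magnitudes is one way to tolerate the scale mismatch between the two flows that the abstract mentions.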



Bootstrapping Human Optical Flow and Pose

We propose a bootstrapping framework to enhance human optical flow and p...

Optical Flow-based 3D Human Motion Estimation from Monocular Video

We present a generative method to estimate 3D human motion and body shap...

Human Pose and Shape Estimation from Single Polarization Images

This paper focuses on a new problem of estimating human pose and shape f...

Learning Optical Flow from Event Camera with Rendered Dataset

We study the problem of estimating optical flow from event cameras. One ...

Motion Equivariant Networks for Event Cameras with the Temporal Normalization Transform

In this work, we propose a novel transformation for events from an event...

Event-based Human Pose Tracking by Spiking Spatiotemporal Transformer

Event camera, as an emerging biologically-inspired vision sensor for cap...

A Temporal Densely Connected Recurrent Network for Event-based Human Pose Estimation

Event camera is an emerging bio-inspired vision sensor that reports per-...
