HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations

by Sadegh Aliakbarian, et al.

Generating both plausible and accurate full-body avatar motion is key to the quality of immersive experiences in mixed reality scenarios. Head-Mounted Devices (HMDs) typically provide only a few input signals, such as 6-DoF head and hand poses. Recently, several approaches have achieved impressive performance in generating full-body motion given only head and hand signals. However, to the best of our knowledge, all existing approaches rely on full hand visibility. While this is the case when, e.g., using motion controllers, a considerable proportion of mixed reality experiences do not involve motion controllers and instead rely on egocentric hand tracking. This introduces the challenge of partial hand visibility owing to the restricted field of view of the HMD. In this paper, we propose the first unified approach, HMD-NeMo, that addresses plausible and accurate full-body motion generation even when the hands may be only partially visible. HMD-NeMo is a lightweight neural network that predicts full-body motion in an online and real-time fashion. At the heart of HMD-NeMo is a spatio-temporal encoder with novel temporally adaptable mask tokens that encourage plausible motion in the absence of hand observations. We perform an extensive analysis of the impact of the different components of HMD-NeMo and, through our evaluation, establish a new state of the art on the AMASS dataset.



BoDiffusion: Diffusing Sparse Observations for Full-Body Human Motion Synthesis

Mixed reality applications require tracking the user's full-body motion ...

QuestSim: Human Motion Tracking from Sparse Sensors with Simulated Avatars

Real-time tracking of human body motion is crucial for interactive and i...

Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling

To bridge the physical and virtual worlds for rapidly developed VR/AR ap...

Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model

With the recent surge in popularity of AR/VR applications, realistic and...

FLAG: Flow-based 3D Avatar Generation from Sparse Observations

To represent people in mixed reality applications for collaboration and ...

GOAL: Generating 4D Whole-Body Motion for Hand-Object Grasping

Generating digital humans that move realistically has many applications ...
