GIMO: Gaze-Informed Human Motion Prediction in Context

04/20/2022
by Yang Zheng, et al.

Predicting human motion is critical for assistive robots and AR/VR applications, where interaction with humans must be safe and comfortable. Accurate prediction depends on understanding both the scene context and human intentions. While many works study scene-aware human motion prediction, intention-aware prediction remains largely underexplored, owing to the lack of ego-centric views that disclose human intent and the limited diversity of motions and scenes in existing data. To bridge this gap, we propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze, which serves as a surrogate for inferring human intent. By employing inertial sensors for motion capture, our data collection is not tied to specific scenes, which further increases the diversity of motion dynamics observed from our subjects. We perform an extensive study of the benefits of leveraging eye gaze for ego-centric human motion prediction with various state-of-the-art architectures. Moreover, to realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches. Our network achieves top performance in human motion prediction on the proposed dataset, thanks to the intent information from the gaze and the denoised gaze features modulated by the motion. The proposed dataset and our network implementation will be publicly available.
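The abstract does not specify how the bidirectional communication between the gaze and motion branches is realized. One common way to implement such two-way exchange is symmetric cross-attention, where each branch queries the other's features; the sketch below illustrates that idea only. All names (`cross_attend`, `bidirectional_fusion`) and the feature dimensions are hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    """Scaled dot-product attention: each query attends over keys_values."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (T_q, T_kv)
    return softmax(scores, axis=-1) @ keys_values   # (T_q, d)

def bidirectional_fusion(motion_feats, gaze_feats):
    """Two-way exchange: motion queries gaze (intent cue),
    gaze queries motion (motion-modulated gaze denoising).
    Residual connections keep each branch's own features."""
    motion_out = motion_feats + cross_attend(motion_feats, gaze_feats)
    gaze_out = gaze_feats + cross_attend(gaze_feats, motion_feats)
    return motion_out, gaze_out

rng = np.random.default_rng(0)
motion = rng.standard_normal((10, 32))  # 10 motion frames, 32-dim features
gaze = rng.standard_normal((6, 32))     # 6 gaze samples, 32-dim features
m, g = bidirectional_fusion(motion, gaze)
```

Each branch keeps its original temporal length and feature width; only the content is enriched with information from the other modality, which is what lets the motion branch use gaze-derived intent while the gaze branch is cleaned up by motion context.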



research
11/23/2020

MoGaze: A Dataset of Full-Body Motions that Includes Workspace Geometry and Eye-Gaze

As robots become more present in open human environments, it will become...
research
12/14/2021

EgoBody: Human Body Shape, Motion and Social Interactions from Head-Mounted Devices

Understanding social interactions from first-person views is crucial for...
research
08/31/2022

The Magni Human Motion Dataset: Accurate, Complex, Multi-Modal, Natural, Semantically-Rich and Contextualized

Rapid development of social robots stimulates active research in human m...
research
04/04/2023

Motion-R3: Fast and Accurate Motion Annotation via Representation-based Representativeness Ranking

In this paper, we follow a data-centric philosophy and propose a novel m...
research
05/10/2017

Predicting the Driver's Focus of Attention: the DR(eye)VE Project

In this work we aim to predict the driver's focus of attention. The goal...
research
07/04/2023

ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour

Gaze behaviors such as eye-contact or shared attention are important mar...
research
09/04/2019

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning

This paper addresses a new problem of understanding human gaze communica...
