Egocentric Prediction of Action Target in 3D

03/24/2022
by Yiming Li, et al.

We aim to anticipate, as early as possible, the target location of a person's object manipulation action in a 3D workspace from egocentric vision. This task matters in fields such as human-robot collaboration, but it has not yet received enough attention from the vision and learning communities. To stimulate more research on this challenging egocentric vision task, we propose a large multimodal dataset of more than 1 million frames of RGB-D and IMU streams, and provide evaluation metrics based on our high-quality 2D and 3D labels from semi-automatic annotation. We also design baseline methods using recurrent neural networks and conduct ablation studies to validate their effectiveness. Our results demonstrate that this new task merits further study by the robotics, vision, and learning communities.
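The abstract does not spell out its evaluation metrics, so the following is only an illustrative sketch of how "as early as possible" could be quantified. It assumes a model emits one 3D target estimate per frame, measures the Euclidean error against the labeled target, and reports the first frame from which the error stays within a tolerance. The function names, the tolerance, and the sample coordinates are all hypothetical and are not taken from the paper.

```python
import math


def target_errors(predictions, target):
    """Euclidean distance between each per-frame 3D prediction and the true target."""
    return [math.dist(p, target) for p in predictions]


def earliest_reliable_frame(errors, tolerance):
    """Index of the first frame from which the error stays within tolerance.

    Returns None if the final frames never settle within the tolerance.
    This criterion is an assumption for illustration, not the paper's metric.
    """
    earliest = None
    for i, err in enumerate(errors):
        if err <= tolerance:
            if earliest is None:
                earliest = i
        else:
            # Error went back above tolerance, so restart the search.
            earliest = None
    return earliest


# Hypothetical per-frame predictions in metres (made-up values).
predictions = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.4), (0.3, 0.4, 0.0), (0.3, 0.4, 0.1)]
target = (0.3, 0.4, 0.0)

errors = target_errors(predictions, target)
first = earliest_reliable_frame(errors, tolerance=0.15)
print(errors, first)
```

A smaller index means the target was anticipated earlier in the action; other criteria, such as averaging the error over the whole clip, would be equally plausible under these assumptions.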


