Learning Object-Action Relations from Bimanual Human Demonstration Using Graph Networks

08/22/2019
by Christian R. G. Dreher, et al.

Recognising human actions is a vital task for a humanoid robot, especially in domains like programming by demonstration. Previous approaches to action recognition focused primarily on the single prevalent action being executed, but we argue that bimanual human motion cannot always be adequately described with a single label. We therefore present a novel approach to action classification and segmentation that learns object-action relations while considering the actions executed by each hand individually. Interpreting the scene as a graph of symbolic spatial relations between the hands and objects enables us to train a neural network architecture specifically designed to operate on variable-sized graphs. To produce these scene graphs, we present a feature extraction pipeline involving human pose estimation and object detection for computing the spatial relations from RGB-D videos. We evaluated the proposed classifier on a new RGB-D video dataset of daily action sequences focused on bimanual manipulation. It comprises 6 subjects performing 9 tasks with 10 repetitions each, yielding 540 video recordings with a total playtime of 2 hours and 18 minutes and per-hand ground-truth action labels for every frame. We show that our classifier reliably identifies the true action executed by each hand within its top 3 predictions (macro F1-score of 0.86) on a frame-by-frame basis, without prior temporal action segmentation.
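As a rough illustration of the pipeline the abstract describes, the following sketch (not the authors' implementation) shows how symbolic spatial relations might be derived from 3D hand and object positions and how a single message-passing step over the resulting scene graph could produce per-hand action logits. The relation rules, the contact threshold, the vocabularies (NODE_TYPES, RELATIONS, ACTIONS), and all dimensions are assumptions made for illustration, and the network weights are untrained.

```python
import numpy as np

# Hypothetical vocabularies; the paper's actual relation and action sets differ.
NODE_TYPES = ["left_hand", "right_hand", "cup", "bottle"]
RELATIONS = ["contact", "above", "below"]
ACTIONS = ["idle", "approach", "lift", "pour", "place"]

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def spatial_relations(positions, contact_thresh=0.05):
    """Derive symbolic relations from 3D entity positions (metres).
    Positions would come from pose estimation and object detection;
    the rules and the threshold here are illustrative only."""
    names = list(positions)
    edges = []
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i == j:
                continue
            pa, pb = np.asarray(positions[a]), np.asarray(positions[b])
            if np.linalg.norm(pa - pb) < contact_thresh:
                edges.append((i, "contact", j))
            elif pa[2] > pb[2]:
                edges.append((i, "above", j))
            else:
                edges.append((i, "below", j))
    return names, edges

rng = np.random.default_rng(0)
D_IN = len(NODE_TYPES) + len(RELATIONS)   # source-type + relation one-hots
D_HID = 16
W_msg = rng.normal(size=(D_IN, D_HID))    # untrained edge-message weights
W_out = rng.normal(size=(D_HID, len(ACTIONS)))

def classify_hands(names, edges):
    """One message-passing step over the scene graph, then per-hand logits."""
    x = np.stack([one_hot(NODE_TYPES.index(n), len(NODE_TYPES)) for n in names])
    h = np.zeros((len(names), D_HID))
    for src, rel, dst in edges:
        msg = np.concatenate([x[src], one_hot(RELATIONS.index(rel), len(RELATIONS))])
        h[dst] += np.tanh(msg @ W_msg)    # aggregate messages at the target node
    logits = h @ W_out
    return {n: logits[i] for i, n in enumerate(names) if "hand" in n}

# Example frame: the right hand grasps a bottle held above a cup.
positions = {
    "left_hand": (0.30, 0.00, 0.90),
    "right_hand": (0.55, 0.10, 1.10),
    "cup": (0.55, 0.30, 0.95),
    "bottle": (0.56, 0.11, 1.09),
}
names, edges = spatial_relations(positions)
for hand, logits in classify_hands(names, edges).items():
    print(hand, "->", ACTIONS[int(np.argmax(logits))])
```

In the paper's setting, the node and relation vocabularies would presumably come from the dataset's object classes and the chosen set of symbolic spatial relations, and the message-passing weights would be learned from the per-hand, per-frame action labels rather than sampled at random.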

Related research

Markerless Visual Robot Programming by Demonstration (07/30/2018)
In this paper we present an approach for learning to imitate human behav...

BABEL: Bodies, Action and Behavior with English Labels (06/17/2021)
Understanding the semantics of human movement – the what, how and why of...

Watch-Bot: Unsupervised Learning for Reminding Humans of Forgotten Actions (12/14/2015)
We present a robotic system that watches a human using a Kinect v2 RGB-D...

A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction (10/25/2021)
Despite decades of research, understanding human manipulation activities...

A self-organizing neural network architecture for learning human-object interactions (10/05/2017)
The visual recognition of transitive actions comprising human-object int...

Hand-tremor frequency estimation in videos (09/10/2018)
We focus on the problem of estimating human hand-tremor frequency from i...

H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions (04/10/2019)
We present a unified framework for understanding 3D hand and object inte...
