Human Action Forecasting by Learning Task Grammars

09/19/2017
by   Tengda Han, et al.

For effective human-robot interaction, a robotic assistant must be able to forecast the next action a human will consider in a given task. Unfortunately, real-world tasks are often very long, complex, and repetitive, so forecasting is not trivial. In this paper, we propose a novel deep recurrent architecture that takes as input features from a two-stream Residual action recognition framework and learns to estimate the progress of human activities from video sequences. This surrogate progress estimation task implicitly learns a temporal task grammar with respect to which activities can be localized and forecasted. To learn the task grammar, we propose a stacked-LSTM-based multi-granularity progress estimation framework that uses a novel cumulative Euclidean loss as its objective. To demonstrate the effectiveness of the proposed architecture, we present experiments on two challenging robotic assistive tasks: (i) assembling an Ikea table from its constituents, and (ii) changing the tires of a car. Our results demonstrate that learning task grammars offers highly discriminative cues, improving forecasting accuracy by more than 9% and outperforming other competitive schemes.
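
The abstract names progress estimation and a cumulative Euclidean loss but does not define either. The sketch below is one plausible reading, not the paper's formulation. It assumes that progress labels grow linearly from 1/T to 1 over a T-frame activity segment, and that the loss sums, at every time step, the squared error accumulated so far. The function names `progress_targets` and `cumulative_euclidean_loss` are hypothetical.

```python
def progress_targets(num_frames):
    """Linear progress labels in (0, 1] for one activity segment.

    Assumption: progress rises uniformly with frame index. The abstract
    does not state how progress is labelled.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return [(t + 1) / num_frames for t in range(num_frames)]


def cumulative_euclidean_loss(predicted, target):
    """Sum over time steps t of the squared error accumulated up to t.

    Assumed form: L = sum_t sum_{s<=t} (p_s - y_s)^2, so an early error
    is counted again at every later step. Illustrative only; this is not
    taken from the paper.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    running = 0.0  # squared error accumulated so far
    total = 0.0    # sum of the running error over all time steps
    for p, y in zip(predicted, target):
        running += (p - y) ** 2
        total += running
    return total


if __name__ == "__main__":
    targets = progress_targets(4)
    print(targets)
    print(cumulative_euclidean_loss([0.2, 0.5, 0.8, 1.0], targets))
```

Under this assumed form, a unit error on the first frame of a two-frame segment costs twice as much as the same error on the last frame, which pushes a model to get the early part of an activity right.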


