Semantic Decomposition and Recognition of Long and Complex Manipulation Action Sequences

10/18/2016
by Eren Erdal Aksoy, et al.

Understanding continuous human actions is a non-trivial but important problem in computer vision. Although a large body of work exists on the recognition of action sequences, most approaches struggle with the wide variation in motions, action combinations, and scene contexts. In this paper, we introduce a novel method for semantic segmentation and recognition of long and complex manipulation action tasks, such as "preparing a breakfast" or "making a sandwich". We represent manipulations with our recently introduced "Semantic Event Chain" (SEC) concept, which captures the underlying spatiotemporal structure of an action and is invariant to motion, velocity, and scene context. Based solely on the spatiotemporal interactions between manipulated objects and hands in the extracted SEC, the framework automatically parses individual manipulation streams performed either sequentially or concurrently. Using event chains, our method further extracts the basic primitive elements of each parsed manipulation. Without requiring any prior object knowledge, the proposed framework can also extract object-like scene entities that play the same role in semantically similar manipulations. We conduct extensive experiments on various recent datasets to validate the robustness of the framework.
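
To make the event-chain idea more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each video frame has already been reduced to a set of pairwise relations between scene entities, and that a chain is obtained by keeping only the frames in which at least one relation changes. The relation labels ("T" for touching, "N" for not touching, "A" for absent), the entity names, and the function `build_event_chain` are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # a pair of scene entities, e.g. ("hand", "cup")


def build_event_chain(frames: List[Dict[Pair, str]]) -> Dict[Pair, List[str]]:
    """Compress per-frame pairwise relations into a chain of key events.

    Each frame maps an entity pair to a relation label. A frame is kept
    only if its relations differ from those of the last kept frame, so
    the result no longer depends on how long each state lasted.
    """
    # Collect every pair that appears anywhere, in a fixed order.
    pairs = sorted({pair for frame in frames for pair in frame})
    chain: Dict[Pair, List[str]] = {pair: [] for pair in pairs}

    previous = None
    for frame in frames:
        # Pairs missing from a frame are treated as absent ("A").
        column = tuple(frame.get(pair, "A") for pair in pairs)
        if column != previous:  # some relation changed: record a key event
            for pair, relation in zip(pairs, column):
                chain[pair].append(relation)
            previous = column
    return chain


# Hypothetical toy sequence: a hand approaches a cup and lifts it off a table.
frames = [
    {("hand", "cup"): "N", ("cup", "table"): "T"},
    {("hand", "cup"): "N", ("cup", "table"): "T"},
    {("hand", "cup"): "T", ("cup", "table"): "T"},
    {("hand", "cup"): "T", ("cup", "table"): "N"},
    {("hand", "cup"): "T", ("cup", "table"): "N"},
]
chain = build_event_chain(frames)
for pair, relations in chain.items():
    print(pair, relations)
```

Because repeated frames collapse into a single column, the same chain results whether the hand moves quickly or slowly, which is one way to read the invariance to motion and velocity claimed in the abstract.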


