Learning Actions from Human Demonstration Video for Robotic Manipulation

09/10/2019
by Shuo Yang, et al.

Learning actions from human demonstration is an emerging trend in designing intelligent robotic systems, an approach that can be referred to as video to command. The performance of such an approach relies heavily on the quality of video captioning. However, general video captioning methods focus on understanding the full frame and lack consideration of the specific objects of interest in robotic manipulation. We propose a novel deep model to learn actions from human demonstration video for robotic manipulation. It consists of two deep networks: a grasp detection network (GNet) and a video captioning network (CNet). GNet performs two functions: providing grasp solutions and extracting local features for the objects of interest in robotic manipulation. CNet outputs the captioning results by fusing the features of both full frames and local objects. Experimental results on a UR5 robotic arm show that our method produces more accurate commands from video demonstrations than state-of-the-art work, thereby leading to more robust grasping performance.
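The two-network structure described above can be sketched in a few lines. This is a hypothetical illustration only: the feature dimensions, the grasp representation, and the fusion scheme (simple concatenation of full-frame and local-object features) are assumptions for the sketch, not details taken from the paper.

```python
import numpy as np

def grasp_network(frame):
    """Stand-in for GNet: returns a grasp proposal and a local object feature.

    A real GNet would detect the object of interest and regress a grasp;
    here both outputs are placeholders with assumed shapes.
    """
    rng = np.random.default_rng(0)
    grasp = {"x": 120, "y": 80, "theta": 0.3}     # placeholder grasp pose
    local_feat = rng.standard_normal(256)         # feature of the object crop
    return grasp, local_feat

def caption_network(global_feat, local_feat):
    """Stand-in for CNet: fuses full-frame and object features before decoding.

    A real CNet would decode the fused feature into a command sequence;
    here we only return the fused vector to show the fusion step.
    """
    return np.concatenate([global_feat, local_feat])  # assumed late fusion

# Dummy inputs with assumed sizes.
frame = np.zeros((224, 224, 3))
global_feat = np.zeros(512)                       # e.g. a CNN full-frame feature
grasp, local_feat = grasp_network(frame)
fused = caption_network(global_feat, local_feat)
print(fused.shape)  # (768,)
```

The point of the sketch is the data flow: GNet contributes both the grasp solution used for execution and the object-local feature that CNet fuses with the full-frame feature before generating the command.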


