Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images

03/08/2019
by Mia Kokic, et al.

We develop a system for modeling hand-object interactions in 3D from RGB images showing a hand holding a novel object from a known category. We design a Convolutional Neural Network (CNN) for Hand-held Object Pose and Shape estimation, called HOPS-Net, and use prior work to estimate the hand pose and configuration. We leverage the insight that information about the hand facilitates object pose and shape estimation: the hand is incorporated into both training and inference of the object pose and shape, as well as into the refinement of the estimated pose. The network is trained on a large synthetic dataset of objects in interaction with a human hand. To bridge the gap between real and synthetic images, we employ an image-to-image translation model (Augmented CycleGAN) that generates realistically textured objects from synthetic renderings, providing a scalable way of producing annotated training data for HOPS-Net. Our quantitative experiments show that even noisy hand parameters significantly help object pose and shape estimation, and our qualitative experiments show pose and shape estimates for objects held by a hand "in the wild".
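The abstract describes a pipeline in which a CNN consumes an RGB view of the hand-object region together with estimated hand parameters and predicts the object's pose and a shape representation. Below is a minimal sketch of such a hand-conditioned regressor, assuming a PyTorch setting; the module names, feature dimensions, hand-parameter size, and output parameterization (6D rotation, translation, latent shape code) are illustrative assumptions, not the authors' actual HOPS-Net architecture.

```python
# Minimal sketch (not the authors' code): a CNN that takes an RGB crop of the
# hand-object region plus estimated hand parameters (global pose + joint
# configuration) and regresses object rotation, translation, and a latent
# shape code. All module names and dimensions are illustrative assumptions.

import torch
import torch.nn as nn


class HandConditionedObjectNet(nn.Module):
    """Hypothetical HOPS-Net-style regressor conditioned on hand parameters."""

    def __init__(self, hand_dim=51, shape_dim=64):
        super().__init__()
        # Image branch: small convolutional encoder over the RGB crop.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Hand branch: embeds the estimated hand pose/configuration vector
        # (assumed here to be a 51-D vector, e.g. 6-D wrist pose + joint angles).
        self.hand_encoder = nn.Sequential(
            nn.Linear(hand_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        # Fused features feed three heads: rotation, translation, shape code.
        fused = 128 + 64
        self.rotation_head = nn.Linear(fused, 6)       # continuous 6-D rotation
        self.translation_head = nn.Linear(fused, 3)    # object centroid offset
        self.shape_head = nn.Linear(fused, shape_dim)  # latent shape embedding

    def forward(self, image, hand_params):
        feat = torch.cat(
            [self.image_encoder(image), self.hand_encoder(hand_params)], dim=1
        )
        return (
            self.rotation_head(feat),
            self.translation_head(feat),
            self.shape_head(feat),
        )


if __name__ == "__main__":
    net = HandConditionedObjectNet()
    crop = torch.randn(2, 3, 128, 128)          # batch of RGB crops
    hand = torch.randn(2, 51)                   # estimated hand parameters
    rot, trans, shape = net(crop, hand)
    print(rot.shape, trans.shape, shape.shape)  # (2, 6) (2, 3) (2, 64)
```

The key design choice reflected here is the one the abstract highlights: hand parameters are concatenated with image features so that the network can exploit the hand as context for the object, which the paper's experiments report as helpful even when the hand estimates are noisy.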
