Detecting Hands and Recognizing Physical Contact in the Wild

by Supreeth Narasimhaswamy, et al.

We investigate a new problem: detecting hands and recognizing their physical contact state in unconstrained conditions. This is a challenging inference task because it requires reasoning beyond the local appearance of hands. The lack of training annotations indicating which object, or which part of an object, the hand is in contact with further complicates the task. To address this problem, we propose a novel convolutional network based on Mask-RCNN that jointly learns to localize hands and predict their physical contact. The network uses outputs from another object detector to obtain the locations of objects present in the scene. It uses these outputs together with the hand locations to recognize the hand's contact state through two attention mechanisms. The first attention mechanism is based on the affinity between the hand and a region enclosing both the hand and the object; it densely pools features from this region into the hand region. The second attention module adaptively selects salient features from this plausible region of contact. To develop and evaluate our method, we introduce a large-scale dataset called ContactHands, containing unconstrained images annotated with hand locations and contact states. The proposed network, including the parameters of the attention modules, is end-to-end trainable. It achieves approximately 7% relative improvement over a baseline network built on the vanilla Mask-RCNN architecture and trained to recognize hand contact states.
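As a rough illustration of affinity-based feature pooling in general, the sketch below weights a few candidate region features by their similarity to a hand feature and adds the weighted sum back to the hand feature. It is not the authors' implementation: the function names (`softmax`, `dot`, `attend`), the scaled dot-product affinity, the additive fusion, and the toy 2-D feature vectors are all assumptions made for this example, since the abstract does not specify these details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(hand_feat, region_feats):
    """Pool region features into the hand feature using affinity weights.

    Assumption: affinity is a scaled dot product and fusion is additive.
    The abstract does not state which affinity or fusion the paper uses.
    """
    scale = math.sqrt(len(hand_feat))
    scores = [dot(hand_feat, f) / scale for f in region_feats]
    weights = softmax(scores)
    # Weighted sum of region features, one output value per feature dimension.
    pooled = [
        sum(w * f[i] for w, f in zip(weights, region_feats))
        for i in range(len(hand_feat))
    ]
    fused = [h + p for h, p in zip(hand_feat, pooled)]
    return weights, fused


# Hypothetical toy features: one hand vector and three candidate regions.
hand = [1.0, 0.0]
regions = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, fused = attend(hand, regions)
print(weights, fused)
```

In this toy setup the region whose feature is most similar to the hand feature receives the largest weight, which is the basic behavior an affinity-driven attention module relies on.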




Contextual Attention for Hand Detection in the Wild

We present Hand-CNN, a novel convolutional network architecture for dete...

Detecting Human-Object Contact in Images

Humans constantly contact objects to move and perform tasks. Thus, detec...

Learning joint reconstruction of hands and manipulated objects

Estimating hand-object manipulations is essential for interpreting and i...

HO-3D_v3: Improving the Accuracy of Hand-Object Annotations of the HO-3D Dataset

HO-3D is a dataset providing image sequences of various hand-object inte...

Understanding Human Hands in Contact at Internet Scale

Hands are the central means by which humans manipulate their world and b...

Nonrigid Object Contact Estimation With Regional Unwrapping Transformer

Acquiring contact patterns between hands and nonrigid objects is a commo...

Context-Aware Synthesis and Placement of Object Instances

Learning to insert an object instance into an image in a semantically co...
