Beyond Holistic Object Recognition: Enriching Image Understanding with Part States

12/15/2016
by Cewu Lu, et al.

Important high-level vision tasks such as human-object interaction, image captioning, and robotic manipulation require rich semantic descriptions of objects at the part level. Building on previous work on part localization, this paper addresses the problem of inferring the rich semantics imparted by an object part in still images. We propose to tokenize the semantic space as a discrete set of part states. Because our modeling of part states is spatially localized, we formulate part state inference as a pixel-wise annotation problem. We design an iterative part-state inference neural network specifically for this task that is both time-efficient and accurate. Extensive experiments demonstrate that the proposed method can effectively predict the semantic states of parts and simultaneously correct localization errors, thus benefiting several visual understanding applications. A further contribution of this paper is our part state dataset, which contains rich part-level semantic annotations.
