AIParsing: Anchor-free Instance-level Human Parsing

07/14/2022
by   Sanyi Zhang, et al.

Most state-of-the-art instance-level human parsing models adopt two-stage anchor-based detectors and therefore cannot avoid heuristic anchor box design or the lack of pixel-level analysis. To address these two issues, we design an instance-level human parsing network that is anchor-free and solvable at the pixel level. It consists of two simple sub-networks: an anchor-free detection head for bounding box prediction and an edge-guided parsing head for human segmentation. The anchor-free detection head inherits the pixel-wise merits of anchor-free detection and effectively avoids sensitivity to hyper-parameters, as demonstrated in object detection applications. By introducing a part-aware boundary clue, the edge-guided parsing head is capable of distinguishing adjacent human parts from one another, for up to 58 parts in a single human instance, even when instances overlap. Meanwhile, a refinement head that integrates the box-level score and the part-level parsing quality is used to improve the quality of the parsing results. Experiments on two multiple human parsing datasets (i.e., CIHP and LV-MHP-v2.0) and one video instance-level human parsing dataset (i.e., VIP) show that our method achieves the best global-level and instance-level performance among state-of-the-art one-stage top-down alternatives.
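To make the anchor-free idea concrete, here is a minimal sketch of per-pixel box regression and score fusion. It assumes an FCOS-style formulation, in which each pixel inside a box regresses its distances to the four box sides. The abstract does not state the exact formulation, so this is an illustration, not the authors' implementation. The function names `ltrb_target`, `decode_box`, and `refined_score` are hypothetical, and the product used to fuse the box-level score with the parsing quality is only one plausible choice.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)
LTRB = Tuple[float, float, float, float]  # distances to left, top, right, bottom


def ltrb_target(px: float, py: float, box: Box) -> Optional[LTRB]:
    """Assumed FCOS-style target: distances from a pixel to the four box sides.

    Returns None when the pixel lies outside the box, i.e. it is not a
    positive sample for this box. No anchor boxes are involved.
    """
    x0, y0, x1, y1 = box
    left, top, right, bottom = px - x0, py - y0, x1 - px, y1 - py
    if min(left, top, right, bottom) < 0:
        return None
    return (left, top, right, bottom)


def decode_box(px: float, py: float, ltrb: LTRB) -> Box:
    """Invert ltrb_target: recover the box from a pixel and its four distances."""
    left, top, right, bottom = ltrb
    return (px - left, py - top, px + right, py + bottom)


def refined_score(box_score: float, parsing_quality: float) -> float:
    """Hypothetical fusion of box-level score and part-level parsing quality.

    A plain product is assumed here; the actual refinement head may differ.
    """
    return box_score * parsing_quality


if __name__ == "__main__":
    person_box = (2.0, 1.0, 10.0, 9.0)
    target = ltrb_target(5.0, 4.0, person_box)
    print(target)
    if target is not None:
        print(decode_box(5.0, 4.0, target))
    print(refined_score(0.9, 0.5))
```

Because every pixel inside a box yields a regression target directly, this design has no anchor scales or aspect ratios to tune, which is the hyper-parameter sensitivity the abstract refers to.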


research
04/02/2019

FCOS: Fully Convolutional One-Stage Object Detection

We propose a fully convolutional one-stage object detector (FCOS) to sol...
research
06/02/2021

Translational Symmetry-Aware Facade Parsing for 3D Building Reconstruction

Effectively parsing the facade is essential to 3D building reconstructio...
research
11/28/2021

CDGNet: Class Distribution Guided Network for Human Parsing

The objective of human parsing is to partition a human in an image into ...
research
05/11/2020

Scope Head for Accurate Localization in Object Detection

Existing anchor-based and anchor-free object detectors in multi-stage or...
research
08/02/2018

Adaptive Temporal Encoding Network for Video Instance-level Human Parsing

Beyond the existing single-person and multiple-person human parsing task...
research
07/29/2019

Consensus Feature Network for Scene Parsing

Scene parsing is challenging as it aims to assign one of the semantic ca...
research
09/22/2019

Double Anchor R-CNN for Human Detection in a Crowd

Detecting human in a crowd is a challenging problem due to the uncertain...
