Relation-aware Compositional Zero-shot Learning for Attribute-Object Pair Recognition
This paper proposes a novel model for recognizing images with composite attribute-object concepts, notably for composite concepts that are unseen during model training. We aim to explore the three key properties required by the task — relation-aware, consistent, and decoupled — to learn rich and robust features for primitive concepts that compose attribute-object pairs. To this end, we propose the Blocked Message Passing Network (BMP-Net). The model consists of two modules. The concept module generates semantically meaningful features for primitive concepts, whereas the visual module extracts visual features for attributes and objects from input images. A message passing mechanism is used in the concept module to capture the relations between primitive concepts. Furthermore, to prevent the model from being biased towards seen composite concepts and reduce the entanglement between attributes and objects, we propose a blocking mechanism that equalizes the information available to the model for both seen and unseen concepts. Extensive experiments and ablation studies on two benchmarks show the efficacy of the proposed model.
READ FULL TEXT