Learning Semantic-Aligned Feature Representation for Text-based Person Search

12/13/2021
by Shiping Li, et al.

Text-based person search aims to retrieve images of a target pedestrian from a textual description. The key challenge of this task is to eliminate the inter-modality gap and achieve feature alignment across modalities. In this paper, we propose a semantic-aligned embedding method for text-based person search, in which feature alignment across modalities is achieved by automatically learning semantic-aligned visual and textual features. First, we introduce two Transformer-based backbones to encode robust feature representations of the images and texts. Second, we design a semantic-aligned feature aggregation network that adaptively selects features with the same semantics and aggregates them into part-aware features. This aggregation is performed by a multi-head attention module constrained by a cross-modality part alignment loss and a diversity loss. Experimental results on the CUHK-PEDES and Flickr30K datasets show that our method achieves state-of-the-art performance.
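The aggregation step described in the abstract can be illustrated with a small sketch. The abstract does not give the formulation, so everything below is an assumption made for illustration: each part is represented by one query vector, attention is single-head scaled dot-product (the paper uses a multi-head module), and the diversity term is taken to be the mean pairwise overlap between the attention maps of different parts. The names `aggregate_parts` and `diversity_penalty` are hypothetical, and the cross-modality part alignment loss is not shown.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_parts(tokens, queries):
    """Aggregate N token features into K part-aware features.

    tokens:  N x D list of token (patch or word) features.
    queries: K x D list of part queries (assumed learnable in a real model).
    Returns (parts, attn): parts is K x D, attn is K x N.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    parts, attn = [], []
    for q in queries:
        # Scaled dot-product score between this part query and every token.
        scores = [sum(qi * ti for qi, ti in zip(q, t)) / scale for t in tokens]
        weights = softmax(scores)
        attn.append(weights)
        # Part feature = attention-weighted sum of the token features.
        parts.append(
            [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)]
        )
    return parts, attn


def diversity_penalty(attn):
    """Mean pairwise overlap between different parts' attention maps.

    Assumed form of a diversity term: lower values mean the parts
    attend to different tokens.
    """
    k = len(attn)
    total, pairs = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            total += sum(a * b for a, b in zip(attn[i], attn[j]))
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy usage: 6 tokens of dimension 8 aggregated into 3 parts.
random.seed(0)
tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
queries = [[random.gauss(0, 1) for _ in range(8)] for _ in range(3)]
parts, attn = aggregate_parts(tokens, queries)
penalty = diversity_penalty(attn)
```

In this sketch the same part queries could be applied to both the image tokens and the text tokens, so that the k-th visual part and the k-th textual part are comparable; whether the paper shares queries in this way is not stated in the abstract.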

