BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification

09/09/2023
by Takuro Fujii, et al.

Text-based person re-identification (TBPReID) aims to retrieve images of a person described by a given textual query. A crucial challenge in this task is how to align images and texts effectively, both globally and locally. Recent works have achieved high performance by solving Masked Language Modeling (MLM) to align image and text parts. However, they perform only uni-directional (i.e., from image to text) local-matching, leaving room for improvement by introducing local-matching in the opposite direction (i.e., from text to image). In this work, we introduce the Bidirectional Local-Matching (BiLMa) framework, which jointly optimizes MLM and Masked Image Modeling (MIM) in TBPReID model training. With this framework, the model is trained to predict the labels of randomly masked image and text tokens from the unmasked tokens. In addition, to narrow the semantic gap between image and text in MIM, we propose Semantic MIM (SemMIM), in which the labels of masked image tokens are given automatically by a state-of-the-art human parser. Experimental results demonstrate that our BiLMa framework with SemMIM achieves state-of-the-art Rank@1 and mAP scores on three benchmarks.
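
The joint objective described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the `MASK` placeholder, the mask ratio, the `mim_weight` factor, and all function names are hypothetical, and the model's predictions are stood in for by plain probability tables. It only shows the idea of masking both modalities and summing an MLM term (over word labels) and a MIM term (over part labels such as those a human parser might assign).

```python
import math
import random

MASK = -1  # hypothetical id standing in for a [MASK] placeholder


def mask_tokens(labels, ratio, rng):
    """Hide a random fraction of tokens; return (masked inputs, masked positions)."""
    n_mask = max(1, round(len(labels) * ratio))
    positions = sorted(rng.sample(range(len(labels)), n_mask))
    inputs = [MASK if i in positions else t for i, t in enumerate(labels)]
    return inputs, positions


def masked_cross_entropy(probs, labels, positions):
    """Mean negative log-likelihood of the true label, over masked positions only."""
    return -sum(math.log(probs[i][labels[i]]) for i in positions) / len(positions)


def bidirectional_loss(text_probs, text_labels, text_pos,
                       image_probs, image_labels, image_pos,
                       mim_weight=1.0):
    """Sum of an MLM term and a MIM term (the weighting is an assumption)."""
    mlm = masked_cross_entropy(text_probs, text_labels, text_pos)
    mim = masked_cross_entropy(image_probs, image_labels, image_pos)
    return mlm + mim_weight * mim


rng = random.Random(0)

# Text side: labels are word ids from a toy 4-word vocabulary.
text_labels = [3, 1, 2, 0, 1, 3]
text_inputs, text_pos = mask_tokens(text_labels, 0.5, rng)

# Image side: labels are semantic part ids per patch (toy 2-class example).
image_labels = [0, 1, 1, 0]
image_inputs, image_pos = mask_tokens(image_labels, 0.5, rng)

# Stand-in predictions: a uniform distribution at every position.
text_probs = [[0.25] * 4 for _ in text_labels]
image_probs = [[0.5] * 2 for _ in image_labels]

loss = bidirectional_loss(text_probs, text_labels, text_pos,
                          image_probs, image_labels, image_pos)
print(loss)
```

With uniform stand-in predictions, each term reduces to the log of the number of classes, which gives a quick sanity check on the loss before any real model is plugged in.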


research
03/22/2023

Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval

Text-to-image person retrieval aims to identify the target person based ...
research
09/04/2023

Unified Pre-training with Pseudo Texts for Text-To-Image Person Re-identification

The pre-training task is indispensable for the text-to-image person re-i...
research
09/18/2023

CLIP-based Synergistic Knowledge Transfer for Text-based Person Retrieval

Text-based Person Retrieval aims to retrieve the target person images gi...
research
03/31/2018

Human Semantic Parsing for Person Re-identification

Person re-identification is a challenging task mainly due to factors suc...
research
07/27/2021

Semantically Self-Aligned Network for Text-to-Image Part-aware Person Re-identification

Text-to-image person re-identification (ReID) aims to search for images ...
research
06/07/2021

Person Re-Identification with a Locally Aware Transformer

Person Re-Identification is an important problem in computer vision-base...
research
10/12/2020

Top-DB-Net: Top DropBlock for Activation Enhancement in Person Re-Identification

Person Re-Identification is a challenging task that aims to retrieve all...
