Cascade Attention Network for Person Search: Both Image and Text-Image Similarity Selection

09/22/2018
by Ya Jing, et al.

Person search with natural language aims to retrieve the matching person from an image database given a sentence describing that person, and it has great potential for applications such as video surveillance. Extracting the visual content that corresponds to the description is the key to this cross-modal matching problem. In this paper, we propose a cascade attention network (CAN) that progressively selects from the person image and from the text-image similarity. In the CAN, a pose-guided attention is first proposed to attend to the person in an augmented input, which concatenates the 3 original image channels with 14 pose confidence maps. With the extracted person image representation, we compute the local similarities between person parts and the textual description. A similarity-based hard attention is then proposed to further select the description-related similarity scores from those local similarities. To verify the effectiveness of our model, we perform extensive experiments on the CUHK Person Description Dataset (CUHK-PEDES), which is currently the only dataset for person search with natural language. Experimental results show that our approach outperforms the state-of-the-art methods by a large margin.
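The pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The function names, the use of cosine similarity for the local scores, and the selection rule (keep the scores at or above the mean of all local scores, then average them) are assumptions made for illustration; the abstract only states that the input has 3 image channels plus 14 pose confidence maps and that a hard attention selects description-related scores.

```python
import math


def concat_channels(image_channels, pose_maps):
    """Build the augmented input: 3 image channels + 14 pose maps = 17 channels."""
    if len(image_channels) != 3 or len(pose_maps) != 14:
        raise ValueError("expected 3 image channels and 14 pose confidence maps")
    return list(image_channels) + list(pose_maps)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def local_similarities(part_features, text_feature):
    """One similarity score per person part against the description feature."""
    return [cosine(part, text_feature) for part in part_features]


def hard_attention(scores, threshold=None):
    """Keep only scores at or above a threshold and average the survivors.

    Assumption: the threshold defaults to the mean of the local scores.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if threshold is None:
        threshold = sum(scores) / len(scores)
    # Fall back to the single best score if rounding leaves nothing selected.
    kept = [s for s in scores if s >= threshold] or [max(scores)]
    return sum(kept) / len(kept), kept


# Toy usage: three hypothetical part features and one text feature.
parts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text = [1.0, 0.0]
sims = local_similarities(parts, text)
final_score, selected = hard_attention(sims)
```

In this toy case the second part is orthogonal to the text feature, so its score falls below the mean and is dropped before the final similarity is averaged.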
