HAT: Hierarchical Aggregation Transformers for Person Re-identification

07/13/2021
by   Guowen Zhang, et al.

Recently, with the advance of deep Convolutional Neural Networks (CNNs), person Re-Identification (Re-ID) has witnessed great success in various applications. However, because CNNs have limited receptive fields, it is still challenging to extract discriminative global representations for persons captured by non-overlapping cameras. Meanwhile, Transformers demonstrate a strong ability to model long-range dependencies in spatial and sequential data. In this work, we take advantage of both CNNs and Transformers and propose a novel learning framework named Hierarchical Aggregation Transformer (HAT) for high-performance image-based person Re-ID. To achieve this goal, we first propose a Deeply Supervised Aggregation (DSA) to recurrently aggregate hierarchical features from CNN backbones. With multi-granularity supervision, the DSA can enhance multi-scale features for person retrieval, which is very different from previous methods. Then, we introduce a Transformer-based Feature Calibration (TFC) to integrate low-level detail information as the global prior for high-level semantic information. The proposed TFC is inserted into each level of hierarchical features, resulting in great performance improvements. To the best of our knowledge, this work is the first to take advantage of both CNNs and Transformers for image-based person Re-ID. Comprehensive experiments on four large-scale Re-ID benchmarks demonstrate that our method achieves better results than several state-of-the-art methods. The code is released at https://github.com/AI-Zhpp/HAT.
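The abstract does not give implementation details, so the following is only a rough illustration of the two ideas it names, not the authors' code. It assumes that "calibration" can be approximated by plain scaled dot-product cross-attention with a residual connection, where high-level tokens query low-level tokens, and that "recurrent aggregation" is a fold from the shallowest level to the deepest. The function names `calibrate` and `aggregate` are hypothetical, there are no learned projections, and the deep supervision losses are omitted.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys_values):
    # Each query token attends over all key/value tokens.
    # Assumption: no learned Q/K/V projections, single head.
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [dot(q, kv) / math.sqrt(dim) for kv in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(dim)])
    return out

def calibrate(high_level, low_level):
    # Hypothetical stand-in for feature calibration: high-level tokens
    # plus the context they gather from low-level tokens (residual add).
    context = cross_attention(high_level, low_level)
    return [[h + c for h, c in zip(hv, cv)]
            for hv, cv in zip(high_level, context)]

def aggregate(levels):
    # Hypothetical stand-in for hierarchical aggregation: fold the
    # feature levels from shallowest to deepest, calibrating each time.
    fused = levels[0]
    for deeper in levels[1:]:
        fused = calibrate(deeper, fused)
    return fused

# Toy feature maps flattened to token lists; all levels share dim 2.
levels = [
    [[0.1, 0.2], [0.3, 0.1], [0.0, 0.4], [0.2, 0.2]],  # shallow, 4 tokens
    [[0.5, 0.1], [0.2, 0.6]],                          # middle, 2 tokens
    [[0.9, 0.3]],                                      # deep, 1 token
]
fused = aggregate(levels)
print(len(fused), len(fused[0]))
```

In this sketch the fused output keeps the token count of the deepest level, and every level is assumed to have already been projected to the same channel dimension. A real model would add learned projections, normalization, and a supervised loss at each level.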


research
04/27/2023

Deeply-Coupled Convolution-Transformer with Spatial-temporal Complementary Learning for Video-based Person Re-identification

Advanced deep Convolutional Neural Networks (CNNs) have shown great succ...
research
04/05/2021

A Video Is Worth Three Views: Trigeminal Transformers for Video-based Person Re-identification

Video-based person re-identification (Re-ID) aims to retrieve video sequ...
research
06/07/2021

Person Re-Identification with a Locally Aware Transformer

Person Re-Identification is an important problem in computer vision-base...
research
04/07/2022

PSTR: End-to-End One-Step Person Search With Transformers

We propose a novel one-step transformer-based person search framework, P...
research
07/19/2019

Interaction-and-Aggregation Network for Person Re-identification

Person re-identification (reID) benefits greatly from deep convolutional...
research
04/08/2022

Vision Transformers for Single Image Dehazing

Image dehazing is a representative low-level vision task that estimates ...
research
04/19/2023

Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification

Generalizable person re-identification (Re-ID) is a very hot research to...
