TransReID: Transformer-based Object Re-Identification

02/08/2021
by Shuting He, et al.

In this paper, we explore the Vision Transformer (ViT), a pure transformer-based model, for the object re-identification (ReID) task. With several adaptations, a strong baseline, ViT-BoT, is constructed with ViT as the backbone; it achieves results comparable to convolutional neural network (CNN) based frameworks on several ReID benchmarks. Furthermore, two modules are designed in consideration of the specific characteristics of ReID data: (1) It is natural and simple for a Transformer to encode non-visual information, such as camera or viewpoint, into vector embedding representations. By plugging in these embeddings, ViT is able to eliminate the bias caused by diverse cameras or viewpoints. (2) We design a Jigsaw branch, parallel to the Global branch, to facilitate the training of the model in a two-branch learning framework. In the Jigsaw branch, a jigsaw patch module is designed to learn robust feature representations and help the training of the transformer by shuffling the patches. With these novel modules, we propose a pure-transformer framework dubbed TransReID, which is, to the best of our knowledge, the first work to use a pure Transformer for ReID research. Experimental results of TransReID are promising, achieving state-of-the-art performance on both person and vehicle ReID benchmarks.
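The two modules described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The abstract does not specify how the side-information embedding is combined with the patch tokens or which shuffling scheme the jigsaw patch module uses, so the additive combination, the weight `lam`, the interleaving shuffle, and all function names below are assumptions made for illustration only.

```python
import random


def make_embedding_table(num_ids, dim, seed=0):
    """Stand-in for a learnable embedding table: one vector per camera/viewpoint ID.

    Assumption: in a real model these vectors would be trained parameters.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_ids)]


def add_side_info(tokens, table, side_id, lam=1.0):
    """Add the side-information vector for `side_id` to every token embedding.

    Assumption: additive combination scaled by a hypothetical weight `lam`.
    """
    side = table[side_id]
    return [[t + lam * s for t, s in zip(tok, side)] for tok in tokens]


def shuffle_patches_into_groups(tokens, num_groups):
    """Interleave patch tokens so each group mixes patches from across the image.

    Assumption: a deterministic interleave is one possible way of "shuffling
    the patches"; the abstract does not state the actual scheme.
    """
    if len(tokens) % num_groups != 0:
        raise ValueError("number of patches must be divisible by num_groups")
    group_size = len(tokens) // num_groups
    # Group g takes patches g, g + num_groups, g + 2 * num_groups, ...
    return [
        [tokens[g + i * num_groups] for i in range(group_size)]
        for g in range(num_groups)
    ]


# Usage: 8 patch tokens of dimension 4, an image taken by camera 2 of 6.
patch_tokens = [[float(i)] * 4 for i in range(8)]
camera_table = make_embedding_table(num_ids=6, dim=4)
tokens_with_camera = add_side_info(patch_tokens, camera_table, side_id=2, lam=0.5)

# Shuffling patch indices 0..7 into 4 groups gives [[0, 4], [1, 5], [2, 6], [3, 7]].
groups = shuffle_patches_into_groups(list(range(8)), num_groups=4)
```

In this sketch, each group of shuffled patches would feed the Jigsaw branch, while the unshuffled token sequence would feed the Global branch. How the two branches are trained jointly is not covered here.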

07/12/2021

GiT: Graph Interactive Transformer for Vehicle Re-identification

Transformers are more and more popular in computer vision, which treat a...
01/04/2022

Short Range Correlation Transformer for Occluded Person Re-Identification

Occluded person re-identification is one of the challenging areas of com...
12/05/2021

Learning Tracking Representations via Dual-Branch Fully Transformer Networks

We present a Siamese-like Dual-branch network based on solely Transforme...
03/14/2022

TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation (WSSS) with only image-level sup...
03/10/2022

Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking

Exploiting a general-purpose neural architecture to replace hand-wired d...
07/11/2018

DeSTNet: Densely Fused Spatial Transformer Networks

Modern Convolutional Neural Networks (CNN) are extremely powerful on a r...
01/03/2022

Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space

This paper explores the feasibility of finding an optimal sub-model from...

Code Repositories

TransReID

[ICCV-2021] TransReID: Transformer-based Object Re-Identification


re_identification

Fast, Simple and Easy to configure Re-Identification Pipeline in PyTorch

