Efficient Visual Tracking via Hierarchical Cross-Attention Transformer

03/25/2022
by   Xin Chen, et al.
0

In recent years, target tracking has made great progress in accuracy. This development is mainly attributed to powerful networks (such as transformers) and additional modules (such as online update and refinement modules). However, less attention has been paid to tracking speed. Most state-of-the-art trackers are satisfied with the real-time speed on powerful GPUs. However, practical applications necessitate higher requirements for tracking speed, especially when edge platforms with limited resources are used. In this work, we present an efficient tracking method via a hierarchical cross-attention transformer named HCAT. Our model runs about 195 fps on GPU, 45 fps on CPU, and 55 fps on the edge AI platform of NVidia Jetson AGX Xavier. Experiments show that our HCAT achieves promising results on LaSOT, GOT-10k, TrackingNet, NFS, OTB100, UAV123, and VOT2020. Code and models are available at https://github.com/chenxin-dlut/HCAT.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/17/2021

Efficient Visual Tracking with Exemplar Transformers

The design of more complex and powerful neural network models has signif...
research
05/11/2023

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

Vision transformers have shown great success due to their high model cap...
research
09/17/2023

LiteTrack: Layer Pruning with Asynchronous Feature Extraction for Lightweight and Efficient Visual Tracking

The recent advancements in transformer-based visual trackers have led to...
research
08/14/2023

Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking

Transformer-based visual trackers have demonstrated significant progress...
research
08/09/2022

CoViT: Real-time phylogenetics for the SARS-CoV-2 pandemic using Vision Transformers

Real-time viral genome detection, taxonomic classification and phylogene...
research
07/31/2021

HiFT: Hierarchical Feature Transformer for Aerial Tracking

Most existing Siamese-based tracking methods execute the classification ...
research
01/28/2023

Cross-domain Neural Pitch and Periodicity Estimation

Pitch is a foundational aspect of our perception of audio signals. Pitch...

Please sign up or login with your details

Forgot password? Click here to reset