Learning Reinforced Attentional Representation for End-to-End Visual Tracking

08/27/2019
by Peng Gao, et al.

Although tracking approaches have advanced tremendously over the last decade, achieving high-performance visual tracking remains an open problem. In this paper, we propose an end-to-end network model that learns a reinforced attentional representation for accurate target object discrimination and localization. We use a novel hierarchical attentional module, built from long short-term memory and multi-layer perceptrons, that leverages both inter- and intra-frame attention to emphasize informative visual patterns. Moreover, we incorporate a contextual attentional correlation filter into the backbone network so that the model can be trained end to end. Our approach not only takes full advantage of informative geometries and semantics, but also updates the correlation filters online without fine-tuning the backbone network, which lets it adapt to variations in target appearance. Extensive experiments on several popular benchmark datasets demonstrate that our approach is effective while remaining computationally efficient.
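To make the idea of updating a correlation filter online without touching the backbone more concrete, here is a minimal sketch of a generic single-channel discriminative correlation filter in the Fourier domain. It is not the contextual attentional correlation filter proposed in the paper: the closed-form ridge-regression solution, the running-average update, and every name and parameter (`gaussian_label`, `train_filter`, `update_filter`, `detect`, `lam`, `lr`, `sigma`) are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian whose peak sits at index (0, 0)."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    # Move the peak from the patch center to the origin, so that
    # "no displacement" corresponds to a response peak at (0, 0).
    return np.fft.ifftshift(g)


def train_filter(x, y, lam=1e-2):
    """Closed-form filter for one patch, kept as numerator/denominator."""
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    num = Y * np.conj(X)
    den = X * np.conj(X) + lam  # lam is the ridge regularizer (assumed value)
    return num, den


def update_filter(num, den, x, y, lr=0.02, lam=1e-2):
    """Online update: blend in the solution for the newest patch.

    Only the filter terms change; the feature extractor that produced
    `x` is left as is (here `x` is simply a raw 2-D array).
    """
    new_num, new_den = train_filter(x, y, lam)
    return (1 - lr) * num + lr * new_num, (1 - lr) * den + lr * new_den


def detect(num, den, z):
    """Return the (dy, dx) displacement of the target inside patch `z`."""
    Z = np.fft.fft2(z)
    response = np.real(np.fft.ifft2((num / den) * Z))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape
    # Peaks past the midpoint correspond to negative circular shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

In this sketch a tracker would call `detect` on each new search patch, recenter on the estimated displacement, and then call `update_filter` with the recentered patch. In the paper's setting the input would be attention-weighted deep features rather than a raw array, which this example does not attempt to reproduce.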


