Video Relation Detection via Tracklet based Visual Transformer

08/19/2021
by   Kaifeng Gao, et al.

Video Visual Relation Detection (VidVRD) has received significant attention from the community in recent years. In this paper, we apply a state-of-the-art video object tracklet detection pipeline, MEGA and deepSORT, to generate tracklet proposals. We then perform VidVRD in a tracklet-based manner, without any pre-cutting operations. Specifically, we design a tracklet-based visual Transformer. It contains a temporal-aware decoder that performs feature interactions between the tracklets and learnable predicate query embeddings, and finally predicts the relations. Experimental results demonstrate the superiority of our method, which outperforms other methods by a large margin on the Video Relation Understanding (VRU) Grand Challenge at ACM Multimedia 2021. Code is released at https://github.com/Dawn-LX/VidVRD-tracklets.
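
The interaction between tracklets and predicate queries described in the abstract can be illustrated with a toy cross-attention step. The sketch below is not the authors' implementation: it is a minimal single-head scaled dot-product attention in plain Python, in which a few query vectors attend over a few tracklet feature vectors. The names cross_attention, tracklet_feats, and predicate_queries, the sizes, and the random initialization are all assumptions made for illustration. The temporal-aware part of the decoder, learned projections, and the final relation classifier are omitted.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, tracklets):
    """Single-head scaled dot-product attention: queries attend over tracklets.

    queries:   list of Q vectors (hypothetical predicate query embeddings)
    tracklets: list of T vectors (hypothetical per-tracklet features)
    Returns (updated_queries, attention_weights).
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    updated, weights = [], []
    for q in queries:
        # Similarity of this query to every tracklet feature.
        scores = [scale * sum(a * b for a, b in zip(q, t)) for t in tracklets]
        w = softmax(scores)
        # Weighted sum of tracklet features (keys and values are shared here).
        out = [sum(w[j] * tracklets[j][d] for j in range(len(tracklets)))
               for d in range(dim)]
        updated.append(out)
        weights.append(w)
    return updated, weights


# Illustrative sizes only; a real model would learn the queries by training.
rng = random.Random(0)
DIM, NUM_TRACKLETS, NUM_QUERIES = 8, 5, 3
tracklet_feats = [[rng.gauss(0, 1) for _ in range(DIM)]
                  for _ in range(NUM_TRACKLETS)]
predicate_queries = [[rng.gauss(0, 1) for _ in range(DIM)]
                     for _ in range(NUM_QUERIES)]

updated_queries, attn = cross_attention(predicate_queries, tracklet_feats)
```

Each row of attn is a distribution over tracklets for one query, so a query's updated vector is a convex combination of tracklet features. In a full model, the updated queries would presumably feed a classifier that predicts the predicate and the subject and object tracklets; that step is left out here because the abstract does not specify it.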

research
01/20/2021

Video Relation Detection with Trajectory-aware Multi-modal Features

Video relation detection problem refers to the detection of the relation...
research
05/05/2018

Revisiting Temporal Modeling for Video-based Person ReID

Video-based person reID is an important task, which has received much at...
research
04/30/2021

Few-Shot Video Object Detection

We introduce Few-Shot Video Object Detection (FSVOD) with three importan...
research
11/18/2022

Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization

This paper deals with the problem of localizing objects in image and vid...
research
06/28/2021

Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection

With rapidly evolving internet technologies and emerging tools, sports r...
research
08/18/2021

Social Fabric: Tubelet Compositions for Video Relation Detection

This paper strives to classify and detect the relationship between objec...
research
06/07/2021

Visual Transformer for Task-aware Active Learning

Pool-based sampling in active learning (AL) represents a key framework f...
