Video Relation Detection with Trajectory-aware Multi-modal Features

01/20/2021
by   Wentao Xie, et al.
0

Video relation detection problem refers to the detection of the relationship between different objects in videos, such as spatial relationship and action relationship. In this paper, we present video relation detection with trajectory-aware multi-modal features to solve this task. Considering the complexity of doing visual relation detection in videos, we decompose this task into three sub-tasks: object detection, trajectory proposal and relation prediction. We use the state-of-the-art object detection method to ensure the accuracy of object trajectory detection and multi-modal feature representation to help the prediction of relation between objects. Our method won the first place on the video relation detection task of Video Relation Understanding Grand Challenge in ACM Multimedia 2020 with 11.74% mAP, which surpasses other methods by a large margin.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/19/2021

Video Relation Detection via Tracklet based Visual Transformer

Video Visual Relation Detection (VidVRD), has received significant atten...
research
09/04/2021

Multi-modal Representation Learning for Video Advertisement Content Structuring

Video advertisement content structuring aims to segment a given video ad...
research
09/23/2021

Pairwise Emotional Relationship Recognition in Drama Videos: Dataset and Benchmark

Recognizing the emotional state of people is a basic but challenging tas...
research
02/27/2017

Visual Translation Embedding Network for Visual Relation Detection

Visual relations, such as "person ride bike" and "bike next to car", off...
research
08/23/2022

DeepInteraction: 3D Object Detection via Modality Interaction

Existing top-performance 3D object detectors typically rely on the multi...
research
08/26/2019

Relation Distillation Networks for Video Object Detection

It has been well recognized that modeling object-to-object relations wou...
research
01/05/2023

What You Say Is What You Show: Visual Narration Detection in Instructional Videos

Narrated "how-to" videos have emerged as a promising data source for a w...

Please sign up or login with your details

Forgot password? Click here to reset