FFAVOD: Feature Fusion Architecture for Video Object Detection

09/15/2021
by Hughes Perreault, et al.

Consecutive frames of a video are highly redundant. Object detectors typically process one image at a time and have no mechanism for exploiting this redundancy. Yet many applications of object detection operate on video, including intelligent transportation systems, advanced driver assistance systems, and video surveillance. Our work aims to exploit the similarity between video frames to produce better detections. We propose FFAVOD, which stands for feature fusion architecture for video object detection. First, we introduce a novel video object detection architecture that allows a network to share feature maps between nearby frames. Second, we propose a feature fusion module that learns to merge feature maps to enhance them. We show that the proposed architecture and fusion module can improve the performance of three base object detectors on two object detection benchmarks containing sequences of moving road users. Additionally, to further increase performance, we propose an improvement to the SpotNet attention module. Using our architecture with the improved SpotNet detector, we obtain state-of-the-art performance on the UA-DETRAC public benchmark as well as on the UAVDT dataset. Code is available at https://github.com/hu64/FFAVOD.
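
To make the idea of merging feature maps from nearby frames more concrete, here is a minimal sketch in NumPy. It is not the authors' fusion module, whose details the abstract does not give. It assumes, purely for illustration, that fusion is a per-channel weighted sum across frames, with weights that would be learned during training. The function name `fuse_feature_maps` and the `(T, C, H, W)` layout are hypothetical choices made for this example.

```python
import numpy as np


def fuse_feature_maps(feature_maps, weights, bias=None):
    """Merge per-frame feature maps into one map (illustrative sketch only).

    feature_maps: array of shape (T, C, H, W), one feature map per nearby frame.
    weights:      array of shape (C, T); hypothetical learned weight of each
                  frame for each channel (an assumption, not from the paper).
    bias:         optional array of shape (C,).
    Returns an array of shape (C, H, W).
    """
    feature_maps = np.asarray(feature_maps, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    num_frames, num_channels, _, _ = feature_maps.shape
    if weights.shape != (num_channels, num_frames):
        raise ValueError("weights must have shape (C, T)")

    # For every channel c, sum the T frames weighted by weights[c, t].
    fused = np.einsum("ct,tchw->chw", weights, feature_maps)
    if bias is not None:
        fused = fused + np.asarray(bias, dtype=np.float64)[:, None, None]
    return fused


# Example: three nearby frames, four channels, 8x8 feature maps.
rng = np.random.default_rng(0)
maps = rng.normal(size=(3, 4, 8, 8))

# Uniform weights reduce the fusion to a plain temporal average.
uniform = np.full((4, 3), 1.0 / 3.0)
fused_mean = fuse_feature_maps(maps, uniform)

# One-hot weights on the middle frame reproduce that frame's features,
# i.e. the behavior of a single-frame detector.
middle_only = np.zeros((4, 3))
middle_only[:, 1] = 1.0
fused_middle = fuse_feature_maps(maps, middle_only)
```

The two weight settings show the range such a module could cover: it can ignore neighboring frames entirely or average over them, and training would pick something in between for each channel.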
