SALISA: Saliency-based Input Sampling for Efficient Video Object Detection

04/05/2022
by Babak Ehteshami Bejnordi, et al.

High-resolution images are widely adopted for high-performance object detection in videos. However, processing high-resolution inputs comes with high computation costs, and naive down-sampling of the input to reduce these costs quickly degrades detection performance. In this paper, we propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection that allows for heavy down-sampling of unimportant background regions while preserving the fine-grained details of a high-resolution image. The resulting image is spatially smaller, which reduces computational cost while enabling performance comparable to that of a high-resolution input. To achieve this, we propose a differentiable resampling module based on a thin plate spline spatial transformer network (TPS-STN). This module is regularized by a novel loss that provides an explicit supervision signal for learning to "magnify" salient regions. We report state-of-the-art results in the low-compute regime on the ImageNet-VID and UA-DETRAC video object detection datasets. We demonstrate that on both datasets, the mAP of an EfficientDet-D1 (EfficientDet-D2) is on par with that of EfficientDet-D2 (EfficientDet-D3) at a much lower computational cost. We also show that SALISA significantly improves the detection of small objects. In particular, SALISA with an EfficientDet-D1 detector improves the detection of small objects by 77% and, remarkably, also outperforms the EfficientDet-D3 baseline.
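To make the idea of non-uniform, saliency-driven down-sampling concrete, here is a minimal NumPy sketch. It is not the TPS-STN module described above: it swaps in a much simpler, non-learned, axis-separable warp that pushes evenly spaced output positions through the inverse CDF of the saliency marginals, so salient rows and columns receive more output pixels. The function names (`axis_coords`, `saliency_resample`) and the `uniform_mix` parameter are hypothetical and exist only for this illustration.

```python
import numpy as np

def axis_coords(marginal, out_size, uniform_mix=0.5):
    """Pick `out_size` source indices along one axis, denser where `marginal` is high.

    Assumption for illustration: the density is blended with a uniform
    density so that background is compressed rather than dropped entirely.
    """
    marginal = np.asarray(marginal, dtype=np.float64) + 1e-8  # avoid division by zero
    n = marginal.size
    density = marginal / marginal.sum()
    density = (1.0 - uniform_mix) * density + uniform_mix / n
    # CDF over source pixel edges: cdf[i] is the mass to the left of pixel i.
    cdf = np.concatenate([[0.0], np.cumsum(density)])
    # Centers of the output pixels, expressed as quantiles in (0, 1).
    targets = (np.arange(out_size) + 0.5) / out_size
    # Invert the strictly increasing CDF: quantile -> source coordinate in [0, n].
    src = np.interp(targets, cdf, np.arange(n + 1))
    return np.clip(src.astype(int), 0, n - 1)

def saliency_resample(image, saliency, out_h, out_w):
    """Down-sample `image` to (out_h, out_w) with nearest-neighbor lookup,
    spending more output pixels on rows/columns with high saliency."""
    rows = axis_coords(saliency.sum(axis=1), out_h)
    cols = axis_coords(saliency.sum(axis=0), out_w)
    return image[np.ix_(rows, cols)], rows, cols

# Toy example: a 64x64 image with one salient 10x10 box.
image = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
saliency = np.zeros((64, 64))
saliency[10:20, 40:50] = 1.0
small, rows, cols = saliency_resample(image, saliency, 32, 32)
```

In this toy case the salient band covers 10 of 64 source rows, yet it receives well over half of the 32 output rows, whereas uniform down-sampling would give it about 5. A separable warp like this cannot magnify several objects at different positions independently, which is one reason a more flexible learned transformation such as a thin plate spline is attractive.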


