ScarfNet: Multi-scale Features with Deeply Fused and Redistributed Semantics for Enhanced Object Detection

08/01/2019
by   Jin Hyeok Yoo, et al.

Convolutional neural networks (CNNs) have led to significant progress in object detection. To detect objects of various sizes, object detectors often exploit the hierarchy of multi-scale feature maps called a feature pyramid, which is readily obtained from the CNN architecture. However, the performance of these object detectors is limited because the bottom-level feature maps, which pass through fewer convolutional layers, lack the semantic information needed to capture the characteristics of small objects. To address this problem, various methods have been proposed to increase the depth of the bottom-level features used for object detection. While most approaches generate additional features through a top-down pathway with lateral connections, our approach directly fuses multi-scale feature maps using bidirectional long short-term memory (biLSTM) in an effort to generate deeply fused semantics. The resulting semantic information is then redistributed to the individual pyramidal feature at each scale through a channel-wise attention model. We integrate our semantic combining and attentive redistribution feature network (ScarfNet) with baseline object detectors, i.e., Faster R-CNN, single-shot multibox detector (SSD), and RetinaNet. Our experiments show that our method outperforms the existing feature pyramid methods as well as the baseline detectors and achieves state-of-the-art performance on the PASCAL VOC and COCO detection benchmarks.
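The two stages named in the abstract, biLSTM fusion across scales followed by channel-wise attentive redistribution, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes each pyramid level has already been reduced to a channel vector of equal length, whereas the method operates on full feature maps. The weights are random, biases are omitted, and the residual addition in `redistribute` is an assumption, as are all function names (`bilstm_fuse`, `redistribute`, `scarf_sketch`).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def lstm_step(params, x, h, c):
    """One standard LSTM step (biases omitted for brevity)."""
    xh = x + h  # list concatenation: [input; previous hidden state]
    i = [sigmoid(v) for v in matvec(params["i"], xh)]    # input gate
    f = [sigmoid(v) for v in matvec(params["f"], xh)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["o"], xh)]    # output gate
    g = [math.tanh(v) for v in matvec(params["g"], xh)]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, seq):
    size = len(seq[0])
    h, c = [0.0] * size, [0.0] * size
    outputs = []
    for x in seq:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return outputs


def bilstm_fuse(levels, fwd, bwd):
    """Treat the pyramid levels as a sequence and read it in both directions."""
    forward = run_lstm(fwd, levels)
    backward = run_lstm(bwd, levels[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]  # per-level concatenation


def redistribute(levels, fused, attn, proj):
    """Gate the pooled semantics per channel, then add them back to each level."""
    semantics = [v for level in fused for v in level]  # concatenate all levels
    outputs = []
    for x, a_w, p_w in zip(levels, attn, proj):
        gates = [sigmoid(v) for v in matvec(a_w, semantics)]  # channel-wise attention
        gated = [g * s for g, s in zip(gates, semantics)]
        delta = matvec(p_w, gated)  # project back to this level's channel count
        outputs.append([xi + di for xi, di in zip(x, delta)])  # assumed residual add
    return outputs


def scarf_sketch(levels, seed=0):
    rng = random.Random(seed)  # random stand-ins for learned weights
    channels, n_levels = len(levels[0]), len(levels)
    fwd = {k: rand_matrix(channels, 2 * channels, rng) for k in "ifog"}
    bwd = {k: rand_matrix(channels, 2 * channels, rng) for k in "ifog"}
    fused = bilstm_fuse(levels, fwd, bwd)
    width = 2 * channels * n_levels
    attn = [rand_matrix(width, width, rng) for _ in levels]
    proj = [rand_matrix(channels, width, rng) for _ in levels]
    return fused, redistribute(levels, fused, attn, proj)


# Three hypothetical pyramid levels with four channels each.
pyramid = [[0.2, -0.1, 0.4, 0.0], [0.5, 0.3, -0.2, 0.1], [-0.3, 0.6, 0.1, 0.2]]
fused, enhanced = scarf_sketch(pyramid)
print(len(fused), len(fused[0]), len(enhanced), len(enhanced[0]))
```

In this sketch the backward pass is what lets the bottom level see information from the top of the pyramid: the second half of `fused[0]` depends on every level, while its first half depends only on level 0. Each enhanced level keeps its original channel count, so it could stand in for the original feature in a detector head.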

research
12/09/2016

Feature Pyramid Networks for Object Detection

Feature pyramids are a basic component in recognition systems for detect...
research
05/18/2018

MDSSD: Multi-scale Deconvolutional Single Shot Detector for small objects

In order to improve the detection accuracy for objects at different scal...
research
03/04/2022

SFPN: Synthetic FPN for Object Detection

FPN (Feature Pyramid Network) has become a basic component of most SoTA ...
research
09/18/2017

StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

One-stage object detectors such as SSD or YOLO already have shown promis...
research
11/20/2018

Learning Better Features for Face Detection with Feature Fusion and Segmentation Supervision

The performance of face detectors has been largely improved with the dev...
research
09/27/2019

ASSD: Attentive Single Shot Multibox Detector

This paper proposes a new deep neural network for object detection. The ...
