Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids

by Zhiqiang Shen, et al.

In this paper, we propose a gated recurrent feature pyramid for the problem of learning object detection from scratch. Our approach is motivated by the recent deeply supervised object detector (DSOD), but explores a new network architecture that dynamically adjusts the supervision intensities of intermediate layers for various scales in object detection. The benefits of the proposed method are two-fold. First, we propose a recurrent feature-pyramid structure that squeezes rich spatial and semantic features into a single prediction layer, which further reduces the number of parameters to learn (DSOD needs to learn 1/2, but our method needs only 1/3). Our new model is therefore better suited to learning from scratch and converges faster than DSOD (using only 50% of the iterations). Second, we introduce a novel gate-controlled prediction strategy that adaptively enhances or attenuates supervision at different scales based on the input object size. As a result, our model is better suited to detecting small objects. To the best of our knowledge, our model is the best-performing approach for learning object detection from scratch. On the PASCAL VOC 2012 comp3 leaderboard (which compares object detectors that are trained only with PASCAL VOC data), our method demonstrates a significant performance jump, from the previous 64% to 77%. We also evaluate our method on the PASCAL VOC 2007, 2012 and MS COCO datasets, and find that the accuracy of our learning-from-scratch method can even beat many state-of-the-art detection methods that use models pre-trained on ImageNet. Code is available at: https://github.com/szq0214/GRP-DSOD .
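The abstract does not spell out how the gate is computed, so the following is only a minimal sketch of the general idea of gate-controlled scaling, assuming a squeeze-and-excitation-style channel gate. The function name `channel_gate`, the single linear layer, and the choice of a scale in the range (0, 2) are illustrative assumptions, not the authors' implementation; see the linked repository for the actual model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feature_map, weights, biases):
    """Hypothetical channel gate (an assumption, not the paper's exact design).

    feature_map: list of C channels, each an H x W list of lists of floats.
    weights:     C x C matrix of a single linear layer.
    biases:      list of C floats.
    Returns the gated feature map and the per-channel scales.
    """
    # Squeeze: global average pooling gives one summary value per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excite: linear layer plus sigmoid, doubled so each scale lies in (0, 2).
    # A scale above 1 enhances a channel, a scale below 1 attenuates it.
    scales = [
        2.0 * sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, biases)
    ]
    # Apply each scale to every spatial position of its channel.
    gated = [
        [[s * v for v in row] for row in channel]
        for channel, s in zip(feature_map, scales)
    ]
    return gated, scales


# Toy input: two 2x2 channels, zero weights, and hand-picked biases.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [1.0, -1.0]

gated, scales = channel_gate(features, weights, biases)
print(scales)  # first scale is above 1 (enhance), second is below 1 (attenuate)
```

In a trained detector the weights and biases would be learned, so the scales would depend on the content of each input rather than on fixed values as in this toy example.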




