Beyond Skip Connections: Top-Down Modulation for Object Detection

by Abhinav Shrivastava, et al.

In recent years, we have seen tremendous progress in the field of object detection. Most of the recent improvements have been achieved by targeting deeper feedforward networks. However, many hard object categories such as bottle, remote, etc. require representation of fine details and not just coarse, semantic representations. But most of these fine details are lost in the early convolutional layers. What we need is a way to incorporate finer details from lower layers into the detection architecture. Skip connections have been proposed to combine high-level and low-level features, but we argue that selecting the right low-level features requires top-down contextual information. Inspired by the human visual pathway, in this paper we propose top-down modulations as a way to incorporate fine details into the detection framework. Our approach supplements the standard bottom-up, feedforward ConvNet with a top-down modulation (TDM) network, connected using lateral connections. These connections are responsible for the modulation of lower layer filters, and the top-down network handles the selection and integration of contextual information and low-level features. The proposed TDM architecture provides a significant boost on the COCO test-dev benchmark, achieving 28.6 AP for VGG16, 35.2 AP for ResNet101, and 37.3 AP for the InceptionResNetv2 network, without any bells and whistles (e.g., multi-scale, iterative box refinement, etc.).
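To make the data flow described above concrete, here is a minimal, dependency-free sketch of one top-down step: a lateral connection transforms a higher-resolution bottom-up feature map, the coarser top-down map is upsampled to match, and the two are merged. It rests on assumptions not stated in the abstract: the merge is a channel concatenation, the learned layers are reduced to 1x1 convolutions with hand-picked weights, and upsampling is nearest-neighbour. The names `conv1x1`, `upsample2x`, and `top_down_step` are hypothetical and are not taken from the authors' implementation.

```python
# Illustrative sketch only. Feature maps are nested lists shaped [C][H][W].

def conv1x1(fmap, weights):
    """Mix channels at each pixel. weights is [C_out][C_in] (assumed 1x1 conv)."""
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(w * fmap[c][y][x] for c, w in enumerate(row)) for x in range(width)]
         for y in range(height)]
        for row in weights
    ]

def relu(fmap):
    """Element-wise max(0, v)."""
    return [[[max(0.0, v) for v in row] for row in channel] for channel in fmap]

def upsample2x(fmap):
    """Nearest-neighbour upsampling by a factor of 2 in height and width."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out

def top_down_step(top, bottom, w_lateral, w_merge):
    """One hypothetical top-down step.

    top:    coarse top-down features, [C_top][H][W]
    bottom: finer bottom-up features, [C_bot][2H][2W]
    """
    lateral = relu(conv1x1(bottom, w_lateral))   # lateral connection
    upsampled = upsample2x(top)                  # match spatial resolution
    merged = upsampled + lateral                 # concatenate along channels (assumption)
    return relu(conv1x1(merged, w_merge))        # integrate context and detail

# Toy inputs: a 1-channel 1x1 top-down map and a 2-channel 2x2 bottom-up map.
top = [[[2.0]]]
bottom = [[[1.0, 2.0], [3.0, 4.0]],
          [[0.0, 3.0], [1.0, 1.0]]]
w_lateral = [[1.0, -1.0]]   # 2 bottom-up channels -> 1 lateral channel
w_merge = [[0.5, 1.0]]      # (1 top-down + 1 lateral) channels -> 1 output channel

out = top_down_step(top, bottom, w_lateral, w_merge)
```

The output has the spatial resolution of the bottom-up input, which is the point of the design: coarse context is carried down to a resolution where fine detail is still present.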




