Object Detection with Mask-based Feature Encoding

by Xiaochuan Fan, et al.
University of South Carolina

Region-based Convolutional Neural Networks (R-CNNs) have achieved great success in the field of object detection. Existing R-CNNs usually divide a Region-of-Interest (ROI) into grids, and then localize objects by utilizing the spatial information reflected by the relative position of each grid in the ROI. In this paper, we propose a novel feature-encoding approach, where spatial information is represented through the spatial distributions of visual patterns. In particular, we design a Mask Weight Network (MWN) to learn a set of masks and then apply channel-wise masking operations to the ROI feature map, followed by global pooling and a cheap fully-connected layer. We integrate the newly designed feature encoder into the Faster R-CNN architecture. The resulting new Faster R-CNNs preserve the object-detection accuracy of the standard Faster R-CNNs while using substantially fewer parameters. Compared to R-FCNs using state-of-the-art PS ROI pooling and deformable PS ROI pooling, the new Faster R-CNNs produce higher object-detection accuracy with good run-time efficiency. We also show that a specifically designed and learned MWN can capture global contextual information and further improve the object-detection accuracy. Validation experiments are conducted on both the PASCAL VOC and MS COCO datasets.
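The encoding pipeline described above (channel-wise masking, then global pooling, then a fully-connected layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `mask_encode`, the toy shapes, the use of max as the global pooling operator, and the hand-written masks standing in for the output of the MWN are all assumptions made for illustration.

```python
def mask_encode(roi, masks, fc_weights, fc_bias):
    """Hypothetical sketch of mask-based feature encoding.

    roi, masks : C x H x W nested lists (one mask per channel; in the paper
                 the masks come from the MWN, here they are fixed by hand).
    fc_weights : K x C nested list, fc_bias : list of length K.
    """
    pooled = []
    for channel, mask in zip(roi, masks):
        # Channel-wise masking: element-wise product of a channel and its mask.
        masked = [f * m
                  for f_row, m_row in zip(channel, mask)
                  for f, m in zip(f_row, m_row)]
        # Global pooling (max is an assumption): one scalar per channel.
        pooled.append(max(masked))
    # Fully-connected layer applied to the C-dimensional pooled vector.
    return [sum(w * p for w, p in zip(row, pooled)) + b
            for row, b in zip(fc_weights, fc_bias)]


# Toy ROI feature map with C=2 channels on a 2x2 spatial grid.
roi = [[[1, 2], [3, 4]],
       [[4, 3], [2, 1]]]
# Each mask selects where in the ROI a channel's pattern should respond.
masks = [[[0, 0], [0, 1]],   # channel 0: bottom-right cell
         [[1, 0], [0, 0]]]   # channel 1: top-left cell
fc_weights = [[1.0, 0.0],
              [0.5, 0.5]]
fc_bias = [0.0, 1.0]

encoded = mask_encode(roi, masks, fc_weights, fc_bias)
print(encoded)  # [4.0, 5.0]
```

In this sketch the fully-connected layer receives only C values per ROI, whereas a layer fed by grid-based pooling would receive one value per channel per grid cell. Spatial information enters through the masks instead of through the grid layout.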




