An Analysis of Pre-Training on Object Detection

by Hengduo Li, et al.

We provide a detailed analysis of convolutional neural networks which are pre-trained on the task of object detection. To this end, we train detectors on large datasets like OpenImagesV4, ImageNet Localization and COCO. We analyze how well their features generalize to tasks like image classification, semantic segmentation and object detection on small datasets like PASCAL-VOC, Caltech-256, SUN-397, Flowers-102 etc. Some important conclusions from our analysis are: 1) Pre-training on large detection datasets is crucial for fine-tuning on small detection datasets, especially when precise localization is needed. For example, we obtain 81.1% mAP on the PASCAL-VOC dataset at 0.7 IoU after pre-training on OpenImagesV4, which is 7.6% better than the recently proposed DeformableConvNetsV2, which uses ImageNet pre-training. 2) Detection pre-training also benefits other localization tasks like semantic segmentation but adversely affects image classification. 3) Images whose features (like avg. pooled Conv5) are similar in the object detection feature space are likely to be similar in the image classification feature space, but the converse is not true. 4) Visualization of features reveals that detection neurons have activations over an entire object, while activations for classification networks typically focus on parts. Therefore, detection networks are poor at classification when multiple instances are present in an image or when an instance covers only a small fraction of an image.


DAP: Detection-Aware Pre-training with Weak Supervision

This paper presents a detection-aware pre-training (DAP) approach, which...

Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training of Image Segmentation Models

While fine-tuning pre-trained networks has become a popular way to train...

A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis

Automatic analysis of scanned historical documents comprises a wide rang...

Region Proposal Network Pre-Training Helps Label-Efficient Object Detection

Self-supervised pre-training, based on the pretext task of instance disc...

Co-localization with Category-Consistent CNN Features and Geodesic Distance Propagation

Co-localization is the problem of localizing objects of the same class u...

Image is First-order Norm+Linear Autoregressive

This paper reveals that every image can be understood as a first-order n...

RevColV2: Exploring Disentangled Representations in Masked Image Modeling

Masked image modeling (MIM) has become a prevalent pre-training setup fo...
