NOTE-RCNN: NOise Tolerant Ensemble RCNN for Semi-Supervised Object Detection

by Jiyang Gao, et al.

The cost of labeling a large number of bounding boxes is one of the main challenges in training modern object detectors. To reduce the dependence on expensive bounding box annotations, we propose a new semi-supervised object detection formulation, in which a few seed box-level annotations and a large number of image-level annotations are used to train the detector. We adopt a training-mining framework, which is widely used in weakly supervised object detection. However, the mining process inherently introduces several kinds of label noise: false negatives, false positives, and inaccurate boundaries, all of which can harm the training of standard object detectors (e.g., Faster RCNN). We propose a novel NOise Tolerant Ensemble RCNN (NOTE-RCNN) object detector to handle such noisy labels. Compared to standard Faster RCNN, it has three highlights: (1) an ensemble of two classification heads and a distillation head to avoid overfitting on noisy labels and to improve mining precision; (2) masking the negative-sample loss in the box predictor to avoid the harm of false negative labels; and (3) training the box regression head only on seed annotations to eliminate the harm from inaccurate boundaries of mined bounding boxes. We evaluate the method on the ILSVRC 2013 and MSCOCO 2017 datasets; we observe that detection accuracy consistently improves as we iterate between mining and training steps, and state-of-the-art performance is achieved.
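The second and third highlights in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of binary cross-entropy and an L1 distance, and the choice to apply the masking only to images with mined (not seed) boxes are all assumptions made for illustration.

```python
import math

def masked_box_classification_loss(probs, labels, is_seed_image):
    """Mean binary cross-entropy over proposals (hypothetical helper).

    probs: predicted foreground probability per proposal, each in (0, 1)
    labels: 1 for a proposal matched to a box, 0 for background
    is_seed_image: True if the boxes are human-drawn seed annotations
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y == 0 and not is_seed_image:
            # Assumption: on mined images a "background" proposal may be an
            # object the miner missed (a false negative), so its loss term
            # is masked out rather than penalized.
            continue
        total += -math.log(p) if y == 1 else -math.log(1.0 - p)
        count += 1
    return total / count if count else 0.0

def box_regression_loss(pred_box, target_box, is_seed_image):
    """L1 box loss that is switched off for mined boxes (hypothetical helper).

    Assumption: mined boxes have unreliable boundaries, so only seed
    annotations contribute to the regression loss.
    """
    if not is_seed_image:
        return 0.0
    return sum(abs(a - b) for a, b in zip(pred_box, target_box))
```

With these definitions, a background proposal contributes to the classification loss on a seed image but is ignored on a mined image, and the regression loss is zero for any mined box regardless of how far the prediction is from it.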




Noisy Annotation Refinement for Object Detection

Supervised training of object detectors requires well-annotated large-sc...

Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection

Deep CNN-based object detection systems have achieved remarkable success...

EDF: Ensemble, Distill, and Fuse for Easy Video Labeling

We present a way to rapidly bootstrap object detection on unseen videos ...

Semi-supervised 3D Object Detection with Proficient Teachers

Dominated point cloud-based 3D object detectors in autonomous driving sc...

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

We analyze the DETR-based framework on semi-supervised object detection ...

Learning to Predict the 3D Layout of a Scene

While 2D object detection has improved significantly over the past, real...

Spatial Semantic Regularisation for Large Scale Object Detection

Large scale object detection with thousands of classes introduces the pr...
