Training with Confusion for Fine-Grained Visual Classification
Research in Fine-Grained Visual Classification has focused on tackling variations in pose, lighting, and viewpoint with sophisticated localization and segmentation techniques, and on robust texture features that improve performance. In this work, we instead examine the underlying optimization of neural network training for fine-grained classification tasks with minimal inter-class variance, and attempt to learn features that generalize better and thus resist overfitting. We introduce Training-with-Confusion, an optimization procedure for fine-grained classification that regularizes training by introducing confusion in the output activations. Our method generalizes to any fine-tuning task, is robust to small training sets and label noise, and adds no overhead at prediction time. We find that Training-with-Confusion improves the state of the art on all major fine-grained classification datasets.
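The abstract does not specify how the confusion term is computed; as a minimal sketch of how such a regularizer could be wired into a standard training loop, the PyTorch snippet below adds a penalty that pulls the predicted distributions of pairs of training samples toward each other. The function name `confusion_regularized_loss`, the pairing scheme, and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def confusion_regularized_loss(logits: torch.Tensor,
                               labels: torch.Tensor,
                               lam: float = 10.0) -> torch.Tensor:
    """Cross-entropy plus a hypothetical 'confusion' penalty.

    The penalty encourages pairs of samples in the batch to produce
    similar predicted distributions, acting as a regularizer.
    Both the pairing scheme and `lam` are assumptions for illustration.
    """
    # Standard classification loss.
    ce = F.cross_entropy(logits, labels)

    # Predicted class distributions.
    probs = F.softmax(logits, dim=1)

    # Pair each sample with its neighbor in the batch (circular shift)
    # and penalize the squared Euclidean distance between predictions.
    paired = probs.roll(shifts=1, dims=0)
    confusion = ((probs - paired) ** 2).sum(dim=1).mean()

    return ce + lam * confusion
```

In this sketch the regularizer is applied only during training, so inference uses the unmodified network and incurs no additional prediction-time cost, consistent with the claim in the abstract.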