Learning Semantically Enhanced Feature for Fine-Grained Image Classification
In this paper, we aim to provide a computationally cheap yet effective approach for fine-grained image classification (FGIC). Unlike previous methods that rely on a sophisticated part localization module for fine-grained feature learning, our approach attains this function by improving the semantics of the sub-features of a global feature. To this end, we first give the sub-features semantic meaning by rearranging the feature channels of a CNN into different groups through channel permutation, which is realized implicitly, without modifying the backbone network structure. A weighted combination regularization, derived from matching the prediction distributions of the global feature and its sub-features, is then employed to guide the learned groups to activate on local parts with strong discriminability, thus increasing the discriminability of the global feature at fine-grained scales. Our approach adds negligible extra parameters to the backbone CNN, can be implemented as a plug-and-play module, and can be trained end-to-end with only image-level supervision. Experiments on four fine-grained benchmark datasets verify the effectiveness of our approach and show that its performance is comparable to that of state-of-the-art methods. Code is available at <https://github.com/cswluo/SEF>
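To make the two ingredients above concrete, the following is a minimal, illustrative sketch and not the authors' implementation (see the linked repository for that). It assumes that the channel groups are contiguous slices of the global feature, that "matching prediction distributions" is measured with a KL divergence between softmax outputs, and that the group weights are supplied by the caller. All function names here are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def split_into_groups(channels, num_groups):
    """Split a global feature (list of channel values) into equal sub-features.

    Assumption: groups are contiguous slices; the abstract only states that
    the grouping is obtained implicitly through channel permutation.
    """
    size, remainder = divmod(len(channels), num_groups)
    if remainder != 0:
        raise ValueError("channel count must be divisible by num_groups")
    return [channels[i * size:(i + 1) * size] for i in range(num_groups)]


def matching_regularizer(global_logits, group_logits, weights):
    """Weighted sum of KL terms between the global prediction and each
    sub-feature prediction (an assumed form of the regularization)."""
    p_global = softmax(global_logits)
    return sum(
        w * kl_divergence(p_global, softmax(g))
        for w, g in zip(weights, group_logits)
    )


if __name__ == "__main__":
    feature = [float(c) for c in range(8)]      # toy 8-channel global feature
    print(split_into_groups(feature, 4))        # four 2-channel sub-features

    global_logits = [2.0, 0.5, -1.0]            # toy 3-class predictions
    group_logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
    print(matching_regularizer(global_logits, group_logits, [0.5, 0.5]))
```

In this sketch the regularizer is zero when every sub-feature predicts the same distribution as the global feature and grows as they diverge. In a real training setup it would be added to the classification loss, with the logits produced by classifiers on top of the CNN features.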