Spatial self-attention network with self-attention distillation for fine-grained image recognition

03/23/2022
by Adu Asare Baffour, et al.

The underlying task in fine-grained image recognition is to capture both inter-class and intra-class discriminative features. Existing methods generally rely on auxiliary data to guide the network or on complex architectures comprising multiple sub-networks. These approaches have two significant drawbacks: (1) auxiliary data such as bounding boxes require expert knowledge and expensive annotation; (2) multiple sub-networks make the architecture complex and require complicated training or multiple training steps. We propose an end-to-end Spatial Self-Attention Network (SSANet) comprising a spatial self-attention (SSA) module and a self-attention distillation (Self-AD) technique. The SSA module encodes contextual information into local features, improving intra-class representation. The Self-AD technique then distills knowledge from the SSA module to a primary feature map, obtaining inter-class representation. Accumulating classification losses from these two modules enables the network to learn both inter-class and intra-class features in a single training step. Experimental results demonstrate that SSANet is effective and achieves competitive performance.
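To make the two-branch idea concrete, the sketch below shows one plausible PyTorch-style reading of the abstract: a non-local spatial self-attention module refines the backbone feature map, both branches are classified, and a distillation term pulls the primary feature map toward the attention-enhanced one. All module and function names (SpatialSelfAttention, SSANet, ssanet_loss) and the choice of MSE for the distillation term are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialSelfAttention(nn.Module):
    """Non-local style self-attention over the H x W positions of a feature map."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)       # (B, HW, C/r)
        k = self.key(x).flatten(2)                          # (B, C/r, HW)
        attn = F.softmax(q @ k, dim=-1)                     # (B, HW, HW) affinities
        v = self.value(x).flatten(2)                        # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)   # aggregate spatial context
        return self.gamma * out + x                         # residual connection


class SSANet(nn.Module):
    """Backbone feature map feeds an SSA branch and a primary branch, trained jointly."""

    def __init__(self, backbone: nn.Module, channels: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                  # assumed to yield a (B, C, H, W) map
        self.ssa = SpatialSelfAttention(channels)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.ssa_head = nn.Linear(channels, num_classes)
        self.primary_head = nn.Linear(channels, num_classes)

    def forward(self, images: torch.Tensor):
        feat = self.backbone(images)              # primary feature map
        ssa_feat = self.ssa(feat)                 # context-enhanced feature map
        logits_ssa = self.ssa_head(self.pool(ssa_feat).flatten(1))
        logits_primary = self.primary_head(self.pool(feat).flatten(1))
        return logits_ssa, logits_primary, ssa_feat, feat


def ssanet_loss(logits_ssa, logits_primary, ssa_feat, feat, labels, alpha=1.0):
    """Accumulated classification losses plus a (hypothetical) MSE distillation term
    that transfers the SSA branch's knowledge to the primary feature map."""
    ce = F.cross_entropy(logits_ssa, labels) + F.cross_entropy(logits_primary, labels)
    distill = F.mse_loss(feat, ssa_feat.detach())
    return ce + alpha * distill
```

Because both classification losses and the distillation term are summed into a single objective, the whole network can be optimized end to end in one training step, which is the property the abstract emphasizes.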
