PyTorch code for our paper "Attention in Attention Network for Image Super-Resolution"
Convolutional neural networks have allowed remarkable advances in single image super-resolution (SISR) over the last decade. Among recent advances in SISR, attention mechanisms are crucial for high performance SR models. However, few works really discuss why attention works and how it works. In this work, we attempt to quantify and visualize the static attention mechanisms and show that not all attention modules are equally beneficial. We then propose attention in attention network (A^2N) for highly accurate image SR. Specifically, our A^2N consists of a non-attention branch and a coupling attention branch. Attention dropout module is proposed to generate dynamic attention weights for these two branches based on input features that can suppress unwanted attention adjustments. This allows attention modules to specialize to beneficial examples without otherwise penalties and thus greatly improve the capacity of the attention network with little parameter overhead. Experiments have demonstrated that our model could achieve superior trade-off performances comparing with state-of-the-art lightweight networks. Experiments on local attribution maps also prove attention in attention (A^2) structure can extract features from a wider range.READ FULL TEXT