Learning Spatial Pyramid Attentive Pooling in Image Synthesis and Image-to-Image Translation

01/18/2019
by Wei Sun, et al.

Image synthesis and image-to-image translation are two important generative learning tasks. Remarkable progress has been made on them by learning Generative Adversarial Networks (GANs) [goodfellow2014generative] and cycle-consistent GANs (CycleGANs) [zhu2017unpaired], respectively. This paper presents a method of learning Spatial Pyramid Attentive Pooling (SPAP), a novel architectural unit that can be easily integrated into both the generators and the discriminators of GANs and CycleGANs. The proposed SPAP integrates atrous spatial pyramid pooling (ASPP) [chen2018deeplab], a proposed cascade attention mechanism, and residual connections [he2016deep]. It leverages the advantages of the three components to facilitate effective end-to-end generative learning: (i) the capability of fusing multi-scale information, provided by ASPP; (ii) the capability of capturing the relative importance of either spatial locations (especially multi-scale context) or feature channels, provided by attention; and (iii) the capability of preserving information and easing optimization, provided by residual connections. Coarse-to-fine and fine-to-coarse variants of SPAP are studied, and intriguing attention maps are observed in both tasks. In experiments, the proposed SPAP is tested in GANs on the CelebA-HQ-128 dataset [karras2017progressive], and in CycleGANs on image-to-image translation datasets including the Cityscapes dataset [cordts2016cityscapes] and the Facade and Aerial Maps datasets [zhu2017unpaired], obtaining better performance in both cases.
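The way the three components combine can be illustrated with a toy sketch. The code below is not the authors' SPAP unit: it is a one-dimensional, pure-Python illustration under strong simplifying assumptions. The function names (`dilated_conv1d`, `spap_like_block`), the fixed smoothing kernel, the dilation rates, and the use of the branch responses themselves as attention scores are all hypothetical choices made for the example. In a real model the kernels and attention scores would be learned, the features would be 2-D maps with channels, and the attention would be the cascade (coarse-to-fine or fine-to-coarse) mechanism described in the paper rather than a single softmax over scales.

```python
import math


def dilated_conv1d(x, kernel, dilation):
    """Same-length 1-D atrous (dilated) convolution with zero padding."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spap_like_block(x, kernel=(0.25, 0.5, 0.25), dilations=(1, 2, 4)):
    """Toy multi-scale block: atrous pyramid -> attention over scales -> residual.

    Assumption: attention scores are the branch responses themselves;
    a trained network would predict them with learned layers.
    """
    # (i) Atrous pyramid: one branch per dilation rate.
    branches = [dilated_conv1d(x, kernel, d) for d in dilations]

    out, attention = [], []
    for i in range(len(x)):
        # (ii) Per-location attention weights over the scales.
        weights = softmax([b[i] for b in branches])
        fused = sum(w * b[i] for w, b in zip(weights, branches))
        # (iii) Residual connection keeps the input signal intact.
        out.append(x[i] + fused)
        attention.append(weights)
    return out, attention


if __name__ == "__main__":
    signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    fused, attn = spap_like_block(signal)
    print(fused)
    print(attn[3])  # attention over the three scales at the impulse location
```

The per-location weights returned as `attention` play the role of the attention maps mentioned in the abstract: they show which scale dominates at each position. The residual addition means the block returns its input unchanged when every branch response is zero.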
