Multi-Mapping Image-to-Image Translation with Central Biasing Normalization

06/26/2018 ∙ by Xiaoming Yu, et al.

Recent image-to-image translation tasks attempt to extend the model from one-to-one mapping to multiple mappings by injecting a latent code. Based on the mathematical formulation of networks with the existing way of injecting latent code, we show that the role of the latent code is to control the mean of the feature maps after convolution. We then find that common normalization strategies may reduce either the diversity across different mappings or the consistency of a specific mapping, making them unsuitable for multi-mapping tasks. We provide a mathematical derivation showing that the effect of the latent code is eliminated after instance normalization, while the distributions of the same mapping become inconsistent after batch normalization. To address these problems, we present a consistency-within-diversity design criterion for multi-mapping networks and propose central biasing normalization, a slight yet significant change to existing normalization strategies. Instead of spatially replicating the latent code and concatenating it to the input layers, we inject it into the normalization layers, where the offset of the feature maps is eliminated to ensure output consistency for a specific mapping, and a bias calculated from the latent code is appended to achieve output diversity across mappings. In this way, not only is the proposed design criterion met, but the modified generator network also has a much smaller number of parameters. We apply this technique to multi-modal and multi-domain translation tasks. Both quantitative and qualitative evaluations show that our method outperforms current state-of-the-art methods. Code and pretrained models are available at https://github.com/Xiaoming-Yu/cbn.
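To make the mechanism concrete, below is a minimal PyTorch sketch of central biasing normalization built on top of instance normalization. The class name, the single fully connected layer, and the tanh squashing of the bias are illustrative assumptions, not the paper's exact released implementation.

```python
import torch
import torch.nn as nn


class CentralBiasingNorm(nn.Module):
    """Sketch: center feature maps, then add a latent-code-dependent bias."""

    def __init__(self, num_features, latent_dim, eps=1e-5):
        super().__init__()
        self.eps = eps
        # Map the latent code to one bias value per feature channel.
        self.fc = nn.Linear(latent_dim, num_features)

    def forward(self, x, z):
        # x: (N, C, H, W) feature maps after convolution
        # z: (N, latent_dim) latent code indicating the mapping
        mean = x.mean(dim=(2, 3), keepdim=True)
        var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        # Eliminating the per-instance offset keeps one mapping consistent.
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        # Appending a bias computed from the latent code restores
        # diversity across mappings (tanh squashing is an assumption).
        bias = torch.tanh(self.fc(z)).unsqueeze(-1).unsqueeze(-1)
        return x_hat + bias
```

In a generator, such a module would take the place of each standard normalization layer and receive the latent code as an extra input, rather than the code being concatenated to the image at the input layer.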


Introduction

Many image processing and computer vision problems can be framed as image-to-image translation tasks [isola2017pix2pix], such as facial attribute transformation, grayscale image colorization, and edge maps to photos. This can also be viewed as mapping an image from one specific domain to another. Although many studies have shown remarkable success in image-to-image translation between two domains, both supervised [isola2017pix2pix, zhang2016colorful] and unsupervised [CycleGAN2017, Yi2017DualGAN, kim2017disco, liu2017UNIT], these one-to-one mapping methods are not suitable for multi-mapping problems: each model learns only a specific mapping from a source domain to a target domain, so a separate model must be built for every pair of mappings, even when some mappings share common semantics. To address this problem, multi-mapping image-to-image translation has received increasing attention [lample2017fader, choi2017stargan, zhu2017multimodal]. In these methods, a latent code is introduced to indicate different mappings, and has been applied to multi-domain translation tasks [lample2017fader, choi2017stargan] and multi-modal translation tasks [zhu2017multimodal], as sketched below.
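For reference, these earlier methods typically inject the latent code by spatially replicating it and concatenating it to the generator input along the channel axis, as in the following sketch (the helper name is hypothetical):

```python
import torch


def concat_latent_code(image, z):
    """Conventional injection scheme: replicate the latent code over
    all spatial positions and concatenate it to the input image."""
    n, _, h, w = image.shape
    # (N, D) -> (N, D, H, W): same code at every spatial location.
    z_map = z.view(n, z.size(1), 1, 1).expand(-1, -1, h, w)
    return torch.cat([image, z_map], dim=1)  # (N, C + D, H, W)
```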

[Figure: An illustration of multi-mapping indicated by the latent code. (a) Multi-domain translation indicated by a limited domain latent code. (b) Cross-domain translation indicated by a potential attribute latent code.]
