Structured GANs

01/15/2020 ∙ by Irad Peleg, et al. ∙ 92

We present Generative Adversarial Networks (GANs), in which the symmetric property of the generated images is controlled. This is obtained through the generator network's architecture, while the training procedure and the loss remain the same. The symmetric GANs are applied to face image synthesis in order to generate novel faces with a varying amount of symmetry. We also present an unsupervised face rotation capability, which is based on the novel notion of one-shot fine tuning.


1 Introduction

Symmetry is a prominent property that has gained attention in face recognition [16, 1, 5] and other applications [15, 7]. Considering symmetry in the context of image generation with GANs, the core research questions that we ask in this work are: (1) how to control the symmetry of generated images and (2) how to rotate an object, when the training set did not contain the relevant supervision.

The first question is answered by proposing two alternative architectures for generative networks. In the first architecture, the first few elements of the input vector serve as the antisymmetric component, and the others serve as the symmetric component. In the second architecture, the generated image is symmetric if the input vector has a palindrome structure, i.e., remains unchanged when the order of its elements is flipped. Both architectures are shown to work much better than an approach in which the loss is used in order to control the symmetry property. Fig. 2 illustrates how the input vector z is converted to the vector that generates the mirrored image in the symmetric GANs.

The second question is answered by a process in which the generator network is adapted in order to generate a specific face. This powerful technique for preserving a given identity in the generated images is a general one and can be applied to many other GAN-based methods.

As far as we know, we are the first to manipulate the structure of the generator in order to enforce properties on the generated image. Due to space constraints, a different structure manipulation, which creates tileable textured patches, is placed in the appendices. The core idea of our work therefore also applies to completely different tasks.

Figure 1: (a) An illustration of the z' architecture. The top plot depicts a random instance of z, which passes through a generator network to create a face image. The bottom plot shows the vector after its first five elements are multiplied by −1. Passing it through the generator of the z' architecture results in a mirrored version. (b) An illustration of the symmetric GAN flip architecture. The top plot depicts a random instance of z, which passes through a generator network to create a face image. The bottom plot shows the same vector after a left-right flip. Applying the generator of the flip architecture to this vector results in a mirrored face image.
Figure 2: An illustration of the relationship between z and the vector that generates the mirrored image, for the two proposed methods. (Left) In the first method, called the z' architecture, part of the GAN’s input, z', encodes symmetry. When this part is zero, the face is symmetric, and when z' is negated, the image is mirrored along the y axis. (Right) In the flip architecture, the image is symmetric when z equals its flipped version, flip(z).

1.1 Related Work

The task of generating realistic looking images has challenged computer vision for a long time. Recently, a major leap has been made with the development of the Generative Adversarial Network (GAN) [10]. These architectures employ two networks, a generator G and a discriminator D, which provide training signals to each other. The network D tries to distinguish between the “fake” images generated by G and the real images provided as a training set. Network G tries to fool D and creates images that look as realistic as possible.

The specific architecture that we employ is based in part on the DC-GAN [17] method. This architecture uses deconvolution and batch normalization in order to create attractive outputs. Specifically, the input vector z in DC-GAN is a 100D vector, whose elements are sampled i.i.d. from a uniform distribution. In our case, we encode symmetry into this vector, and through the use of specific architectures, we enforce the generated output G(z) to display the required level of symmetry.

Through our focus on symmetry, our networks are able to extract the notion of yaw. GAN-based methods that extract semantic regularities in an unsupervised manner include InfoGAN [6] and Fader networks [13].

2 Symmetric GANs

We present two architecture-based methods, which differ in the manipulation that the input undergoes in order to create a mirrored version, see Fig. 1. We also present a loss-based method to serve as a baseline. The difficulty of training this loss-based method successfully emphasizes the effectiveness of the architecture-based methods.

2.1 The Symmetric Architectures

Our GANs are symmetric end-to-end, including both the generator G and the discriminator D.

Both G and D contain convolutional layers, as well as fully connected ones. In order to maintain symmetry, both of these layer types are augmented. The fully connected parts are handled differently in G and in D. We also present two architectures for G, which differ exactly in the way in which the fully connected layers are constructed. The convolutional layers are treated exactly the same in all variants of G and in D, and in all of these cases the same symmetric kernels are introduced. The flow of the symmetric generator is presented in Fig. 3.

Figure 3: Flow of symmetric generator.

The two alternative architectures differ in the very first layer. The first architecture creates symmetric images for inputs in which the first part is zero. The second architecture produces symmetric outputs for inputs which are themselves symmetric, i.e., z = flip(z), where the flip operator switches the first element with the last one, the second element with the one before the last, and so on.

2.1.1 The Generator of the z’ Architecture

The first generator architecture splits the input vector z into two parts. The first, z', is the anti-symmetric part, while the second is the symmetric part. For input vectors that have z' = 0, the output is completely symmetric.

The architecture that ensures the symmetric and anti-symmetric properties enforces this structure on the first feature map, and maintains it thereafter by employing symmetric convolutional kernels.

The generation of the first feature map is depicted in Fig. 4. The random input vector z, which contains 100 i.i.d. uniformly distributed elements, is split into two parts. The first part, z', which holds the first five elements, is mapped through a fully connected layer (affine transformation) to a vector of size 5120. This vector is then reshaped into 512 kernels of size 5×2, which are transformed to antisymmetric kernels of size 5×5 by taking the last column to be the negative of the first column and the column before it to be the negative of the second column, see Fig. 5(c).

The remaining 95 elements of z, which form the symmetric part, are mapped to a vector of size 7680, which is then reshaped to 512 kernels of size 5×3. By performing the symmetric reflection depicted in Fig. 5(b), i.e., copying the first column to the fifth column and the second to the fourth column, 5×5 kernels are obtained.
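The construction of the first feature map in the z' architecture can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: all function and variable names are hypothetical, the weights are random rather than trained, the two branches are assumed to be combined by summation, the middle column of each antisymmetric map is assumed to be zero, and no bias term is used, so that z' = 0 yields an exactly symmetric map.

```python
import numpy as np

def reflect_symmetric(half):
    """(..., 5, 3) -> (..., 5, 5): copy column 1 to column 5 and column 2 to column 4."""
    return np.concatenate([half, half[..., 1::-1]], axis=-1)

def reflect_antisymmetric(half):
    """(..., 5, 2) -> (..., 5, 5): last column = -first, fourth = -second, middle = 0 (assumed)."""
    zeros = np.zeros(half.shape[:-1] + (1,))
    return np.concatenate([half, zeros, -half[..., ::-1]], axis=-1)

def first_feature_map(z, W_anti, W_sym, n_anti=5, channels=512):
    """Map a 100D vector z to 512 feature maps of size 5x5 (bias-free sketch)."""
    z_anti, z_sym = z[:n_anti], z[n_anti:]
    anti = reflect_antisymmetric((W_anti @ z_anti).reshape(channels, 5, 2))
    sym = reflect_symmetric((W_sym @ z_sym).reshape(channels, 5, 3))
    return sym + anti  # summation of the two components is an assumption

rng = np.random.default_rng(0)
W_anti = rng.normal(size=(5120, 5))   # 5 -> 512 * 5 * 2
W_sym = rng.normal(size=(7680, 95))   # 95 -> 512 * 5 * 3
z = rng.uniform(-1, 1, size=100)

z_mirror = z.copy()
z_mirror[:5] *= -1                    # negate z'

fmap = first_feature_map(z, W_anti, W_sym)
fmap_mirror = first_feature_map(z_mirror, W_anti, W_sym)
```

Under these assumptions, negating the first five elements of z mirrors the first feature map left to right, and setting them to zero makes it symmetric.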

The rest of the network follows the DC-GAN architecture, except that the 5×5 kernels, of 25 parameters each, are replaced with symmetric kernels that contain only 15 different parameters.
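The role of the symmetric kernels can be illustrated with a short NumPy sketch. It builds a left-right symmetric 5×5 kernel from 15 free parameters and checks that filtering a mirrored input gives the mirrored output. The helper names are hypothetical, and the sketch uses a plain stride-1 correlation with zero padding instead of the strided deconvolutions of DC-GAN, so it demonstrates the principle only.

```python
import numpy as np

def symmetric_kernel(params):
    """Expand 15 free parameters into a 5x5 kernel whose columns mirror around the middle."""
    params = np.asarray(params, dtype=float).reshape(5, 3)
    return np.concatenate([params, params[:, 1::-1]], axis=1)

def correlate2d_same(x, k):
    """Stride-1 2D cross-correlation with zero padding; output has the shape of x."""
    kh, kw = k.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(x.shape, dtype=float)
    for a in range(kh):
        for b in range(kw):
            out += k[a, b] * padded[a:a + x.shape[0], b:b + x.shape[1]]
    return out

rng = np.random.default_rng(0)
kernel = symmetric_kernel(rng.normal(size=15))
image = rng.normal(size=(8, 8))

out = correlate2d_same(image, kernel)
out_of_mirror = correlate2d_same(image[:, ::-1], kernel)
```

Because the kernel equals its own mirror image, `out_of_mirror` coincides with the mirrored `out`, which is the property that lets a symmetric feature map in one layer lead to a symmetric feature map in the next.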

Figure 4: Symmetric GAN G flow to first feature map decomposed to symmetric and antisymmetric components
Figure 5: A comparison of four kernel types. (a) The standard kernel or feature map. (b) The symmetric kernel and feature map. (c) The anti-symmetric feature map. (d) Symmetric folding of a feature map.

Fig. 5(c) shows the antisymmetric reflection that is applied to the symmetry-breaking part of the first feature map of G.

2.1.2 The Generator of the flip Architecture

In the alternative symmetric architecture, the generator produces symmetric images when, for a given input z, it happens that z = flip(z). Here, too, the symmetric kernels throughout the layers make sure that a symmetric feature map from one layer leads to a symmetric feature map in the next.

The architecture of the first layer is depicted in Fig. 6. The same matrix is applied, as the weights of a fully connected layer, to both z and flip(z), to obtain vectors of length 12,800. These are reshaped to 512 feature maps of size 5×5. The feature map that results from the flipped vector is then flipped left to right (the first column is swapped with the fifth and the second with the fourth). The matching pairs, one from each branch of the network, are then summed.

Given that z = flip(z), the two branches produce identical feature maps before the left-right flipping. After one of them is flipped, each summed pair consists of a feature map and its mirror image, and thus symmetry is obtained.
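The first layer of the flip architecture can be sketched in the same way. The names below are hypothetical and the shared weight matrix is random; the sketch only illustrates why a palindromic input gives a symmetric feature map and why flipping the input mirrors the map.

```python
import numpy as np

def flip_first_feature_map(z, W, channels=512):
    """Apply the same matrix W to z and to flip(z), mirror the second branch, and sum."""
    direct = (W @ z).reshape(channels, 5, 5)
    from_flipped = (W @ z[::-1]).reshape(channels, 5, 5)[..., ::-1]  # left-right flip
    return direct + from_flipped

rng = np.random.default_rng(0)
W = rng.normal(size=(12800, 100))      # 100 -> 512 * 5 * 5
z = rng.uniform(-1, 1, size=100)

half = rng.uniform(-1, 1, size=50)
z_palindrome = np.concatenate([half, half[::-1]])   # satisfies z = flip(z)

fmap = flip_first_feature_map(z, W)
fmap_of_flipped = flip_first_feature_map(z[::-1], W)
fmap_palindrome = flip_first_feature_map(z_palindrome, W)
```

Here `fmap_of_flipped` is the mirror image of `fmap`, and `fmap_palindrome` is its own mirror image.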

Figure 6: symmetric GAN G flow to first feature map as symmetric mapping from a vector to an image.

2.1.3 The Symmetric Discriminator Architecture

A desired property in our framework is that the discriminator D returns the same probability of “real” for an image and for its mirrored version. This is not the case in conventional architectures. For example, during training, a conventional discriminator overfits the training samples while failing on their mirror images, see Fig. 7.

We therefore enforce symmetry by using a specific architecture. First, we replace the discriminator of the DC-GAN with one that has left-right symmetric kernels, similar to those used in the generators. At the last convolutional layer, we obtain a feature map that is five columns wide. This feature map undergoes a symmetric folding, as depicted in Fig. 5(d). Namely, the first (second) column is replaced by the sum of this column and the fifth (fourth) column. In addition, the third column is doubled. The three resulting columns can be seen as a single vector. Through a fully connected layer, a single output is then obtained. The succession of symmetric kernels and the symmetric folding ensures that mirror images obtain the same score by D. The flow of the symmetric discriminator is shown in Fig. 8.
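The symmetric folding step can be sketched as follows. The function name and the number of channels (512) are assumptions made for illustration; the point is only that a feature map and its mirror image fold to the same vector, so any layer applied afterwards gives both the same score.

```python
import numpy as np

def symmetric_fold(fmap):
    """Fold (..., rows, 5) into (..., rows, 3):
    column 1 + column 5, column 2 + column 4, and twice column 3."""
    outer = fmap[..., :2] + fmap[..., :2:-1]   # [..., :2:-1] selects columns 5 and 4
    middle = 2.0 * fmap[..., 2:3]
    return np.concatenate([outer, middle], axis=-1)

rng = np.random.default_rng(0)
fmap = rng.normal(size=(512, 5, 5))            # channel count is an assumption

folded = symmetric_fold(fmap)
folded_of_mirror = symmetric_fold(fmap[..., ::-1])
descriptor = folded.reshape(-1)                # vector fed to the fully connected layer
```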

Figure 7: The output probability of the “real” label of the classifier D for the training samples and their mirrored images, when the classifier does not use the symmetric invariant architecture of Sec. 2.1.3. There is clear overfitting on the training data. However, there is also great uncertainty about the flipped versions.
Figure 8: Flow of symmetric discriminator.

2.2 Alternative Loss Based Method

In addition to the architecture based approaches, we evaluated baseline methods for which the symmetry is enforced by adding a loss term. Multiple experiments were conducted, each with varying emphasis (weight) of the added loss term. The overall conclusion is that such training is highly unstable and that the model tends to collapse and either ignore the term (when the weight is low), or result in symmetric images for every input (when the weight is high).

As above, the vector z encodes symmetry either by having its first five elements equal to zero, or by being invariant to element flipping.

The loss term we add is based on pairs of inputs, each consisting of a vector z and a counterpart that should generate the mirrored image. For the z' based symmetry encoding, the counterpart is identical to z, except that the first five elements are replaced by their negation. For the flipping based symmetry encoding, the counterpart is flip(z). The loss term aggregates, over all such pairs and scaled by a weight, the difference between the image generated from z and the mirrored image generated from its counterpart, where mirror flips the order of the columns of the image.
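A possible form of such a loss term is sketched below. The squared L2 distance, the averaging over the batch, and all names are assumptions made for illustration, since the exact expression is not recoverable here. The toy generators are small random linear maps and serve only to exercise the function.

```python
import numpy as np

def mirror(img):
    """Flip the order of the columns of an image."""
    return img[..., ::-1]

def negate_first(z, k=5):
    """Counterpart for the z' encoding: negate the first k elements."""
    z_bar = z.copy()
    z_bar[:k] *= -1
    return z_bar

def flip(z):
    """Counterpart for the flip encoding: reverse the order of the elements."""
    return z[::-1]

def symmetry_loss(G, z_batch, counterpart, weight):
    """Weighted mean squared discrepancy between G(z) and mirror(G(counterpart(z)))."""
    total = 0.0
    for z in z_batch:
        total += np.sum((G(z) - mirror(G(counterpart(z)))) ** 2)
    return weight * total / len(z_batch)

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))

def plain_G(z):
    """Toy unstructured generator: 8D vector -> 4x4 image."""
    return (W @ z).reshape(4, 4)

def flip_G(z):
    """Toy generator with the flip structure built in."""
    return plain_G(z) + mirror(plain_G(flip(z)))

z_batch = rng.uniform(-1, 1, size=(32, 8))
loss_plain = symmetry_loss(plain_G, z_batch, flip, weight=0.1)
loss_flip = symmetry_loss(flip_G, z_batch, flip, weight=0.1)
```

The unstructured generator incurs a positive penalty, which training would have to drive down, while the generator with the flip structure satisfies the constraint by construction and incurs no penalty.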

3 Processing an Existing Image

In order to manipulate an existing image, one first needs to recover its “underlying” vector z by employing a reconstruction loss. Employing this vector in order to recover the image and rotated versions of it suffers from noticeable reconstruction errors. Most notably, the reconstructed face does not maintain the identity of the input image, despite their similarity. The same phenomenon is also apparent when observing the approximation results of the most recent generation schemes, such as BEGAN [3] (see Fig. 9).

Figure 9: Optimizing the vector z to fit the top input image in BEGAN [3] (used with permission). The generated images look realistic and somewhat similar. However, identity is not preserved.

We therefore propose to fine-tune G using the same loss, while focusing on the recovered z. By doing so, we obtain an image-specific network that is able to generate the input image, but is no longer as general as G.

We assume a symmetric generator that employs the z' method. Having the recovered z and the fine-tuned generator, we are then able to alter the amount of symmetry by modifying z'. First, the mirror image is generated by negating z' (as above). The spectrum in between the image and the mirror image, which provides a virtual yaw effect, is then spanned by interpolating between z' and −z'. The results are shown in Fig. 14(d). As shown in the experiments, the results are superior to those obtained using the unmodified generator G.

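Algorithm 1 Recovering the underlying vector z for an input image and modifying the generator G in order to obtain an image-specific version that reproduces the input image from z

procedure FitAndTune
  I. Solve the following optimization problem: find the vector z that minimizes the reconstruction loss between G(z) and the input image.
  II. Alternate between the following:
    1. Symmetric GAN training for G and D.
    2. Minimize the reconstruction loss over both G and z, where both are initialized as their current values.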

The process of recovering z is illustrated in Algo. 1. First, we iteratively optimize z to minimize the reconstruction loss between G(z) and the input image. We found it necessary to employ weight decay on z and also to use a hinge loss to encourage its values to remain within the range from which z is sampled.

In the second phase, we allow G to be optimized as well, creating a version that is tailored to the specific sample. In order to prevent G from becoming degenerate and too specific to the problem of reconstructing the input image, we alternate between iterations performed on the training data that was used for training G and iterations that optimize G and z to minimize the reconstruction risk.
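The first phase of Algo. 1 can be sketched with plain gradient descent. Everything specific here is an assumption for illustration: the generator is replaced by a toy linear map so that the gradient is analytic, the reconstruction loss is taken to be the squared error, the range bound is set to 1, and the step size and weights are arbitrary. The second phase (fine-tuning G) is not covered.

```python
import numpy as np

def objective(W, x, z, weight_decay=1e-3, hinge_weight=1.0, bound=1.0):
    """Squared reconstruction error + weight decay on z + hinge penalty outside [-bound, bound]."""
    recon = np.sum((W @ z - x) ** 2)
    decay = weight_decay * np.sum(z ** 2)
    hinge = hinge_weight * np.sum(np.maximum(0.0, np.abs(z) - bound))
    return recon + decay + hinge

def fit_z(W, x, z0, steps=2000, lr=1e-3, weight_decay=1e-3, hinge_weight=1.0, bound=1.0):
    """Recover z for the toy generator G(z) = W z by (sub)gradient descent."""
    z = z0.astype(float).copy()
    for _ in range(steps):
        grad = 2.0 * W.T @ (W @ z - x)                            # reconstruction term
        grad += 2.0 * weight_decay * z                            # weight decay on z
        grad += hinge_weight * np.sign(z) * (np.abs(z) > bound)   # hinge subgradient
        z -= lr * grad
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 10))            # toy linear "generator"
z_true = rng.uniform(-0.9, 0.9, size=10)
x = W @ z_true                           # toy "input image"

z0 = np.zeros(10)
z_hat = fit_z(W, x, z0)
```

With a real generator, the analytic gradient would be replaced by backpropagation through G, and the recovered z would in general only approximate the input image, which is what motivates the fine-tuning phase.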

4 Experiments

We present empirical evaluation results for both types of structured GANs studied: z' and flip.

4.1 Applying Symmetric GANs to Face Images

For evaluating the symmetric GAN methods, we have compared the following methods:

  1. A simple DC-GAN with no symmetric properties.

  2. DC-GAN with a soft symmetric loss term (low weight).

  3. DC-GAN with a strong symmetric loss term (high weight).

  4. Our Symmetric GAN using the z' architecture.

  5. Our Symmetric GAN using the flip Architecture.

In each method, we generated a series of nine images while attempting to enforce symmetry on it, so that the first image is a mirror of the last, the second of the one before the last, and so on. Since the number of images is odd, the middle image is expected to be symmetric to itself. The way of obtaining the symmetry is determined by the method and follows the description in Fig. 2.

Sample results are presented in Fig. 10. As can be seen, DC-GAN creates high quality images but, as expected, shows no symmetry effect. In DC-GAN with a soft symmetric loss, the loss term was not strong enough to enforce symmetry on the generated images; it did, however, introduce light deformations into the images, caused by unstable training. DC-GAN with a strong symmetric loss was very unstable during training, and the generated images were mostly symmetric to themselves and of poor quality. The results of both of our symmetric techniques were much more convincing and presented the desired effect.

Figure 10: Series of images generated by various GAN methods, trying to obtain a mirroring effect between images on the left and right side of each series. (a) DC-GAN. The first 5 elements of the 100D vector change, but this has little effect on the output image. (b) DC-GAN with a soft symmetric loss term, using the z’ encoding of symmetry. (c) DC-GAN with a strong symmetric loss term, using the z’ encoding of symmetry. (d) Symmetric-GAN using the z’ architecture. (e) Symmetric-GAN using the flip architecture.

We then averaged each generated image with the corresponding image on the other side of the series, after the second image of the pair was mirrored. If the mirroring effect is exact, no artifacts are expected. The results can be seen in Fig. 11. This visualization clearly demonstrates that both our methods (z' and flip) create mirror images when the input dictates this.

Figure 11: Same as Fig. 10, where symmetry is further evaluated by averaging the image generated from z with the mirror image of the image generated from the vector that is manipulated to produce the mirror image; each location shows this average. (a) DC-GAN. (b) DC-GAN with a soft symmetric loss term, z’ encoding of symmetry. (c) DC-GAN with a strong symmetric loss term, z’ encoding. (d) Symmetric-GAN using the z’ architecture. (e) Symmetric-GAN using the flip architecture.
Figure 12: Using the z’ architecture, z’ is multiplied by a scalar that varies over [-0.5,+0.5] in steps of 0.1 (left to right). For each row, the rest of z is held constant. Each row is a different experiment, producing 11 images.

Finally, we measure the MSE between each image and the mirror version of it. The results are shown in Fig. 22 in the appendices. As can be seen, the proposed methods drop to nearly 0 in the middle image, indicating that those images are symmetric. We can see that the MSE of the other methods is relatively constant and does not drop to zero. The loss-based method with the strong symmetric constraint creates images that are symmetric throughout the range of values. An even stronger symmetry loss would lead to an MSE close to zero along the entire curve, with an image that is barely recognizable as a face.
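The measurement itself is simple and can be sketched as follows. The function name is hypothetical, and the nine "images" are synthetic arrays built from a symmetric part plus a scaled antisymmetric part, standing in for a generated series; they are not generator outputs.

```python
import numpy as np

def mirror_mse(img):
    """Mean squared error between an image and its left-right mirror."""
    return float(np.mean((img - img[..., ::-1]) ** 2))

rng = np.random.default_rng(0)
base = rng.normal(size=(8, 8))
sym = base + base[:, ::-1]     # symmetric component
anti = base - base[:, ::-1]    # antisymmetric component

# Synthetic series of nine images whose asymmetry goes from -1 to +1.
series = [sym + t * anti for t in np.linspace(-1.0, 1.0, 9)]
curve = [mirror_mse(img) for img in series]
```

For such a series the curve is symmetric around the middle image and reaches zero there, which is the behaviour described for the proposed methods.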

Manipulating a Face Image

In order to manipulate a given image, we have recovered the vector z that best matches the image and then manipulated it, as explained in Sec. 3. Fig. 13 depicts the results obtained by recovering this vector and then generating images from manipulated versions. There are noticeable artifacts. These artifacts are largely reduced when performing the per-image tuning of G in order to obtain an image-specific generator, as can be seen in Fig. 14.

Figure 13: The results obtained without transfer. (a) Sample dataset images. (b) G(z) for the vector z that was optimized to minimize the reconstruction loss. Results are shown for a symmetric generator that is based on the z’ architecture. (c) Rotations performed using G and manipulated versions of z.
Figure 14: (a) Dataset samples. (b) G(z) for the vector z that was optimized to minimize the reconstruction loss. Results are shown for a symmetric generator that is based on the z’ architecture. (c) The output of the image-specific generator, which is fine-tuned from G in order to further minimize the reconstruction error. (d) Rotations performed using the fine-tuned generator and manipulated versions of z.
Figure 15: Using the z’ architecture, all five fields of z’, except one, are set to zero. The remaining field is varied over [-1,+1] in steps of 0.2 (left to right). In each row, a different field is changed. The other 95 fields of z are unchanged during the experiment. Each experiment produces 5x11 images. The experiment has been repeated three times.

4.2 Symmetrical Views of Man-Made Scenes

To show that our method is general, the network was also trained on the LSUN bedrooms dataset [22]. Unlike a face, a bedroom is not symmetric. However, since mirror images of rooms belong to the same class, the method fits this kind of data well.

In our experiments, we focused on the z' architecture. The first experiment shows how a generated image is affected when setting its z' component closer to or further away from zero. The results, depicted in Fig. 12, show that the closer z' is to zero, the more symmetric the generated image is, and that the image associated with z' and that associated with −z' are mirror images. The second experiment is similar, with the single change that we fix z' to zero everywhere except for one coordinate at a time, which is changed. This way, we can study the effect of each individual dimension on the output. The results are shown in Fig. 15. It is clear that each dimension controls a different mode of variability. However, the dimensions are not independent and the same objects emerge by using different coordinates.

5 Conclusions

DC-GANs are being used today for a wide range of applications, such as domain transfer networks [19], photo editing [4], denoising, data creation and more. We demonstrate how, by manipulating the structure of the generator, we can directly control the symmetry of the output. A second, completely different application, tiling, which is presented in the appendices, shows that a similar structure-modifying design carries over to other tasks.

Acknowledgements

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant ERC CoG 725974).

We are grateful to Barak Itkin for proposing the tiling application.

References

  • [1] Y. Adini, Y. Moses, and S. Ullman (1997) Face recognition: the problem of compensating for changes in illumination direction. TPAMI 19 (7), pp. 721–732.
  • [2] U. Bergmann, N. Jetchev, and R. Vollgraf (2017) Learning texture manifolds with the periodic spatial GAN. arXiv preprint arXiv:1705.06566.
  • [3] D. Berthelot, T. Schumm, and L. Metz (2017) BEGAN: boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717.
  • [4] A. Brock, T. Lim, J. M. Ritchie, and N. Weston (2016) Neural photo editing with introspective adversarial networks. arXiv preprint arXiv:1609.07093.
  • [5] R. Brunelli and T. Poggio (1993) Face recognition: features versus templates. TPAMI 15 (10), pp. 1042–1052.
  • [6] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel (2016) InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In NIPS.
  • [7] T. Cohen and M. Welling (2016) Group equivariant convolutional networks. In International Conference on Machine Learning, pp. 2990–2999.
  • [8] L. A. Gatys, A. S. Ecker, and M. Bethge (2015) A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576.
  • [9] L. A. Gatys, A. S. Ecker, and M. Bethge (2015) Texture synthesis using convolutional neural networks. In NIPS.
  • [10] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio (2014) Generative adversarial nets. In NIPS, pp. 2672–2680.
  • [11] N. Jetchev, U. Bergmann, and R. Vollgraf (2016) Texture synthesis with spatial generative adversarial networks. arXiv preprint arXiv:1611.08207.
  • [12] J. Johnson, A. Alahi, and L. Fei-Fei (2016) Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pp. 694–711.
  • [13] G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer, and M. Ranzato (2017) Fader networks: manipulating images by sliding attributes. arXiv preprint arXiv:1706.00409.
  • [14] C. Li and M. Wand (2016) Precomputed real-time texture synthesis with markovian generative adversarial networks. In European Conference on Computer Vision, pp. 702–716.
  • [15] S. Liang, J. Chen, Z. Li, and G. Bai (2016) Linear time symmetric axis search based on palindrome detection. In ICIP, pp. 1799–1803.
  • [16] A. Pentland, B. Moghaddam, T. Starner, et al. (1994) View-based and modular eigenspaces for face recognition. In CVPR, Vol. 94, pp. 84–91.
  • [17] A. Radford, L. Metz, and S. Chintala (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
  • [18] K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
  • [19] Y. Taigman, A. Polyak, and L. Wolf (2016) Unsupervised cross-domain image generation. arXiv preprint arXiv:1611.02200.
  • [20] D. Ulyanov, A. Vedaldi, and V. Lempitsky (2017) Improved texture networks: maximizing quality and diversity in feed-forward stylization and texture synthesis. In CVPR.
  • [21] D. Ulyanov, V. Lebedev, A. Vedaldi, and V. Lempitsky (2016) Texture networks: feed-forward synthesis of textures and stylized images. In Int. Conf. on Machine Learning (ICML).
  • [22] F. Yu, A. Seff, Y. Zhang, S. Song, T. Funkhouser, and J. Xiao (2015) LSUN: construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.

Appendix A Generating Tiles

As a second application for manipulating the structure of GANs, we present methods for creating tiles that can be arranged repeatedly in 2D in a variety of predetermined patterns. Just like symmetry, tiling enforces a specific structure on the output. For example, in the case of simple tiling, where tiles are being placed in the same orientation on a grid, the top (left) part of the tile should merge smoothly with the bottom (right) part.

A.1 Previous Work on Texture Synthesis

Gatys et al. [9] demonstrated how to capture texture properties from a given image and generate new images with the same texture properties. The descriptor is based on a pre-trained network, usually VGG [18]. A Gram matrix is extracted from the feature maps of certain layers. The objective compares the descriptors of the target image to those of the source image. Other works [8, 20] perform style transfer by combining this with a content loss from a feature map of a deep layer of VGG. Later on, works such as [12, 21, 14, 11, 2] showed how to train generative networks that generate images whose texture properties were already embedded during the training process. Works like [14, 11, 2] do so as a GAN implementation.

In contrast to previous work, we focus on tiles and not on the textured image. This allows us to develop GANs that create tiles for complex tiling patterns.

A.2 An Architecture for Generating Tiles

The idea of enforcing structure by constructing a suitable architecture, as opposed to modifying just the loss, extends beyond symmetry to the problem of tiling. The input to the tiling problem is an image of some texture. The goal is to synthesize a patch that:

  1. Has texture properties that are indistinguishable from those of patches from the source image.

  2. Has a periodic structure such that when the patch is concatenated to itself, there is no texture discontinuity at the boundary.

The most basic tiling pattern repeats each tile, as is, in multiple columns and rows. However, as Fig. 16 illustrates, there are many alternative patterns in which the tiles might be rotated or placed in more complex arrangements.

Figure 16: The four forms of tiling presented in this paper. From left to right: (simple rectangular lattice, real projective plane topology, spherical topology, hexagonal lattice)

As in symmetry, we employ a modified version of the generator G of the DC-GAN method [17] in order to transform a random vector z into a patch image of a fixed size. Unlike the symmetry encoding case, in which the vector z encodes whether the output image is symmetric or not, for tiling, we expect all outputs to maintain the two desired properties, and z is completely random.

Since it is the texture properties of the patch that we are concerned with, we encode the patch using the Gram matrix extracted from the generated image as well as from all layers of D, right after the convolution, and before adding the bias, performing batch normalization and applying ReLU. Specifically, the Gram matrix of a layer collects the inner products between all pairs of channels of the feature map of that layer. A virtual layer of ones is added in order to capture first order statistics, and the size of the Gram matrix computed for a layer is, therefore, (n + 1) × (n + 1), where n is the number of filters in this layer. All Gram matrices are then normalized.
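A Gram descriptor with the extra layer of ones can be sketched as follows. The function name is hypothetical, and normalizing by the number of spatial positions is an assumption, since the normalization constant is not recoverable here.

```python
import numpy as np

def gram_with_ones(fmap):
    """fmap: (n_filters, H, W) -> Gram matrix of shape (n_filters + 1, n_filters + 1).

    A constant channel of ones is appended, so the last row and column hold
    first order statistics (channel means under the assumed normalization)."""
    n, h, w = fmap.shape
    flat = fmap.reshape(n, h * w)
    flat = np.vstack([flat, np.ones((1, h * w))])
    return flat @ flat.T / (h * w)   # normalization by H*W is an assumption

rng = np.random.default_rng(0)
fmap = rng.normal(size=(16, 8, 8))
gram = gram_with_ones(fmap)
```

With this normalization, the last row of `gram` contains the per-channel means, which is how the virtual layer of ones captures first order statistics alongside the second order ones.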

All Gram fields from all the layers of D are concatenated into one descriptor, which is fed to the fully connected part of D. At each batch, crops of the source texture image are used as the “real” samples, and generated samples of the same size are used as the “fake” samples. The architecture of D for capturing textures is depicted in Fig. 17.

Figure 17: The architecture of D for texture synthesis.
Figure 18: The tile and random crop method as applied to (a) vanilla tiling on a grid and (b) tiling in the spherical topology.

We propose two different tiling GAN methods. The first employs cyclic deconvolutions and the second tiles and crops.

Cyclic deconvolution

In order to support horizontal tiling, for example, it is necessary for the leftmost part of the patch to match the rightmost part. This is enforced by replacing the deconvolution blocks of G with cyclic deconvolution blocks, in which the support of the convolution extends beyond the edges of the feature map and wraps around to the other end of the map. This is done for all layers of G. Note that for complex tiling patterns, the cyclic deconvolution takes more complex forms (see below).
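The wrap-around behaviour can be illustrated with a small NumPy sketch. It uses an ordinary stride-1 correlation with cyclic padding instead of a deconvolution, and the function name is hypothetical, so it shows only the boundary handling and not the full block.

```python
import numpy as np

def cyclic_correlate2d(x, k):
    """2D cross-correlation whose support wraps around the edges of x."""
    kh, kw = k.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="wrap")
    out = np.zeros(x.shape, dtype=float)
    for a in range(kh):
        for b in range(kw):
            out += k[a, b] * padded[a:a + x.shape[0], b:b + x.shape[1]]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))
k = rng.normal(size=(5, 5))

out = cyclic_correlate2d(x, k)
out_of_shifted = cyclic_correlate2d(np.roll(x, 3, axis=1), k)
```

Because the operation treats the feature map as periodic, cyclically shifting the input simply shifts the output (`out_of_shifted` equals `np.roll(out, 3, axis=1)`), so no position, including the seam between neighbouring tiles, is treated as a boundary.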

Tile and randomly crop

In this method, it is the discriminator that enforces the tiling property. This is done by taking the generated image, tiling it in the plane, and cropping a patch from the result. This patch is then fed to D. If there are tiling artifacts in the crop, the discriminator will pick up on them. During backpropagation, G is updated in a way that reduces the artifacts, and it thereby learns the tiling pattern implicitly. See Fig. 18.
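For the simple grid pattern, the tile and random crop step can be sketched as follows. The function name, the number of repetitions, and the crop size are assumptions for illustration; the more complex tiling patterns of Fig. 16 would need a different tiling routine.

```python
import numpy as np

def tile_and_random_crop(tile, crop_h, crop_w, rng, reps=(3, 3)):
    """Tile a 2D patch on a regular grid and cut a random crop from the result."""
    plane = np.tile(tile, reps)
    top = int(rng.integers(0, plane.shape[0] - crop_h + 1))
    left = int(rng.integers(0, plane.shape[1] - crop_w + 1))
    return plane[top:top + crop_h, left:left + crop_w], (top, left)

rng = np.random.default_rng(0)
tile = rng.normal(size=(16, 16))      # stands in for a generated patch
crop, (top, left) = tile_and_random_crop(tile, 16, 16, rng)
```

A crop of the same size as the tile almost always straddles a tile boundary, so a discriminator that sees such crops is exposed to any seam artifacts.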

A.3 Tiling Experiments

We first present, in Fig. 19, the results obtained for the simple grid tiling. As can be seen, tiling using tiles generated by the baseline DC-GAN leads to noticeable artifacts at the boundaries of the tiles, while either one of the two methods we propose avoids these artifacts.

Figure 19: A comparison of the tiling methods. (a) real texture. (b) tiling outputs of DC-GAN. (c) the outcome of tiling with the cyclic deconvolution method. (d) the outcome of tiling with the tile and crop method.

We further experimented with less conventional tiling approaches. The results are shown in Fig. 20. The proposed methods perform well, except that the cyclic convolution method is not appropriate for the spherical topology, since it requires the conversion of a row to a column and vice versa.

Figure 20: Row 1 shows tiling of a real texture. Row 2 shows tiling using the cyclic deconvolution method. Row 3 shows tiling using the tile and random crop method. Column (a) shows tiling in a hexagonal pattern. Column (b) shows tiling in the pattern of the real projective plane topology. Column (c) shows tiling in the pattern of the spherical topology.

A closer look at the various artifacts is provided in Fig. 21.

Figure 21: A collection of three repeatable artifacts observed during the tiling experiments. (a) A noise texture that appears in some cases of tiling using the random crop method. (b) Hexagonal tiling with the tile and crop method results in a constant tile; here each tile has a different z and yet all tiles are the same. (c) A discontinuity phenomenon typical for cyclic deconvolution combined with the spherical topology.

Appendix B MSE Plot for Symmetric GANs

We measure the MSE between each image and the mirror version of it. The results are shown in Fig. 22. As can be seen, the proposed methods drop to nearly 0 in the middle image, indicating that those images are symmetric to themselves. We can see that the MSE of the other methods is relatively constant and does not drop to zero. The loss-based method, with the strong symmetric constraints creates images that are symmetric throughout the range of values. An even stronger symmetry loss would lead to an MSE close to zero along the entire curve, with an image that is barely recognizable as a face.

Figure 22: The MSE difference between a generated image G(z) and the mirrored image generated from the vector that is supposed to produce the mirror image, i.e., for the loss-based method and the z’ symmetric GAN, the first five coordinates of this vector are the negatives of the first five coordinates of z.