1 Paper Summary
In recent years, Generative Adversarial Networks (GANs) have shown substantial progress in modeling complex data distributions. These networks have received tremendous attention because they learn implicit probabilistic models that produce realistic data through a stochastic procedure. While such models have proven highly effective in diverse scenarios, they require a large set of fully-observed training samples. In many applications, access to such samples is difficult or even impractical, and only noisy or partial observations of the desired distribution are available. Recent research has tried to recover the distribution of the data from incompletely observed samples. Zhu et al. (2017) and Yeh et al. (2016) proposed methods to solve ill-posed inverse problems using cycle-consistency and latent-space mappings in adversarial networks, respectively. Bora et al. (2017) and Kabkab et al. (2018) applied similar adversarial approaches to the problem of compressed sensing.
In this work, we focus on a new variant of GAN models called AmbientGAN, which incorporates a measurement process (e.g. adding noise, removing data, or projecting) into GAN training. Whereas the discriminator in a standard GAN distinguishes a generated image from a real image, the discriminator in the AmbientGAN model has to separate a real measurement from a simulated measurement of a generated image. The results shown by Bora et al. (2018) are quite promising for the problem of incomplete data, and have potentially important implications for generative approaches to compressed sensing and ill-posed problems.
1.2 Proposed approach
The original Generative Adversarial Network proposed by Goodfellow et al. (2014) tries to map an easy-to-sample distribution (e.g. a low-dimensional Gaussian distribution) to a high-dimensional domain of interest such as images. As shown in Fig. 1(a), the standard GAN is composed of a generator that attempts to produce a realistic image, and a discriminator that determines which images are real and which have been produced by the generator. In the standard GAN, the dataset contains fully-observed samples that have not passed through any measurement process such as noise or projection. One approach to dealing with a dataset drawn from an incomplete or measured distribution is to apply the inverse of the measurement function and use the inverted (i.e. unmeasured) samples for training. As shown in Fig. 1(b), the estimated inverse samples for a given measurement are used to learn a generative model that approximates the distribution of the real data. This model is called the baseline model in the original AmbientGAN paper. Since some of the measurement models are not invertible, the authors use an approximation of the inverse function.
The AmbientGAN model adapts the original GAN configuration in an efficient way to handle cases in which the dataset consists of noisy or incomplete samples. The idea behind the AmbientGAN model is to apply the same measurement process to the output of the generator. As shown in Fig. 1(c), the discriminator in the AmbientGAN model has to differentiate the measurement of a generated image from a real measurement. Under the assumption that the distribution of measured images uniquely determines the distribution of original images, training the AmbientGAN model results in a generator that produces images similar to the real, unmeasured images.
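In Fig. 1(c), let $Z \in \mathbb{R}^k$ be the latent variable vector with the distribution $p_z$, which can be a Gaussian or uniform distribution. Let the generator $G: \mathbb{R}^k \to \mathbb{R}^n$ produce generated samples $X^g = G(Z)$ with the distribution $p_x^g$. The measurement function $f_\theta$ is parametrized by $\Theta$, which has the distribution $p_\theta$, and outputs $Y^g = f_\Theta(X^g)$ with the distribution $p_y^g$. Since the desired samples $X^r \sim p_x^r$ are not available, we are given a set of IID samples $Y^r$ from the measurement distribution $p_y^r$ as the training dataset. As in the standard GAN, the objective function for the AmbientGAN model is a min-max problem:
$$\min_G \max_D \; \mathbb{E}_{Y^r \sim p_y^r}\left[q(D(Y^r))\right] + \mathbb{E}_{Z \sim p_z,\, \Theta \sim p_\theta}\left[q(1 - D(f_\Theta(G(Z))))\right],$$
where $q(\cdot)$ is the quality function (e.g. $q(x) = \log(x)$ for the standard GAN). The goal of this objective function is to learn a generator such that $p_x^g$ is close to $p_x^r$ (since if $X \sim p_x^r$ and $\Theta \sim p_\theta$, then $f_\Theta(X) \sim p_y^r$).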
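To make the objective concrete, the following minimal Python sketch estimates its value from discriminator outputs on a batch. The function name `ambient_gan_objective` and the toy inputs are our own illustration, not code from the original paper or the repositories we used; `d_real` stands for the discriminator outputs on real measurements and `d_fake` for its outputs on measured generated samples.

```python
import math

def ambient_gan_objective(d_real, d_fake, q=math.log):
    """Monte Carlo estimate of E[q(D(Y_real))] + E[q(1 - D(Y_generated))].

    d_real: discriminator outputs in (0, 1) on real measurements.
    d_fake: discriminator outputs in (0, 1) on measured generated samples.
    q: quality function; math.log corresponds to the standard GAN.
    """
    real_term = sum(q(d) for d in d_real) / len(d_real)
    fake_term = sum(q(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term

# Toy case (hypothetical values): a discriminator that cannot tell the two
# kinds of measurement apart outputs 0.5 for every sample.
value = ambient_gan_objective([0.5, 0.5], [0.5, 0.5])
```

The discriminator is trained to increase this quantity and the generator to decrease it. With the logarithmic quality function, the toy case above evaluates to $-2\log 2$.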
Assumptions: One assumption is that the measurement function $f_\theta$ is known and that it is easy to sample $\Theta \sim p_\theta$. In addition, $f_\theta$ is required to be differentiable with respect to its inputs for all $\theta$, so that the model is end-to-end differentiable for training. Another important assumption is that for a given observed measurement distribution $p_y^r$, there is a unique true underlying distribution $p_x^r$. As a result, if the discriminator is optimal, i.e. $D(\cdot) = p_y^r(\cdot) / (p_y^r(\cdot) + p_y^g(\cdot))$, then a generator is optimal iff $p_x^g = p_x^r$. In other words, if the distribution of the measured generated images is close to that of the measured real images, the distribution of the generated images without measurement will be close to the distribution of the real images. This uniqueness assumption, and the lemma that follows from it, hold for certain measurement models, which are introduced in the original paper.
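As a small worked illustration of the uniqueness assumption (our own toy example, not taken from the original paper), consider a single binary pixel $X \in \{0, 1\}$ that is set to zero with a known probability $p < 1$, independently of its value. The measured pixel $Y$ equals one only when the original pixel is one and is not blocked, so:

```latex
\begin{aligned}
\Pr[Y = 1] &= (1 - p)\,\Pr[X = 1], \\
\Pr[X = 1] &= \frac{\Pr[Y = 1]}{1 - p}.
\end{aligned}
```

In this toy setting the measured distribution determines the original one exactly. When $p = 1$ every measurement is zero and nothing can be recovered, which is why uniqueness holds only under conditions on the measurement model.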
2.1 Motivation for the experiments
Since the AmbientGAN framework aims to model the distribution of the true data based on the available noisy and incomplete training samples, we are motivated to compare its effectiveness relative to the baseline model. As mentioned earlier, in the baseline model the training samples are first cleaned up using the inverse of the measurement function (or its approximation) and then delivered to the discriminator. We also want to assess the capability of the AmbientGAN model to recover the true underlying distribution under different measurement models. The measurement models used in the paper are listed as follows:
Block-Pixels: where each pixel is independently set to zero with probability $p$.
Convolve-Noise: where images are convolved with a Gaussian kernel $k$, and noise $\Theta \sim p_\theta$ is added.
Keep-Patch: where pixels outside of a randomly chosen $k \times k$ patch are set to zero.
Extract-Patch: where pixels within a randomly chosen $k \times k$ patch are extracted.
Pad-Rotate-Project: where the padded image is rotated by a random angle $\theta$ about its center and then projected to 1D.
Pad-Rotate-Project-$\theta$: where the chosen angle $\theta$ is also included in the measurement.
Gaussian-Projection: where the image is projected onto a random Gaussian vector.
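Three of the measurement models listed above can be sketched in a few lines of plain Python. This is a minimal illustration under our own assumptions: images are lists of rows of floats, the function names are hypothetical, and the random source is passed in explicitly. An actual training implementation would express the same operations as differentiable tensor operations so that gradients can flow back to the generator.

```python
import random

def block_pixels(image, p, rng):
    """Set each pixel to zero independently with probability p."""
    return [[0.0 if rng.random() < p else v for v in row] for row in image]

def keep_patch(image, k, rng):
    """Zero every pixel outside a randomly placed k x k patch."""
    h, w = len(image), len(image[0])
    top, left = rng.randrange(h - k + 1), rng.randrange(w - k + 1)
    return [[v if top <= i < top + k and left <= j < left + k else 0.0
             for j, v in enumerate(row)]
            for i, row in enumerate(image)]

def extract_patch(image, k, rng):
    """Return a randomly placed k x k patch; its location is discarded."""
    h, w = len(image), len(image[0])
    top, left = rng.randrange(h - k + 1), rng.randrange(w - k + 1)
    return [row[left:left + k] for row in image[top:top + k]]

# Toy usage on an all-ones 8 x 8 image (hypothetical sizes).
rng = random.Random(0)
image = [[1.0] * 8 for _ in range(8)]
blocked = block_pixels(image, p=0.5, rng=rng)
kept = keep_patch(image, k=3, rng=rng)
patch = extract_patch(image, k=3, rng=rng)
```

Note the difference between the two patch measurements: Keep-Patch returns an image of the original size, so the position of the patch is preserved, whereas Extract-Patch returns only the patch itself.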
2.2 Reproducing the main results
In order to verify the results of the AmbientGAN model, we performed a set of experiments similar to the ones in the paper. For the implementation, we took inspiration from two GitHub repositories (https://github.com/shinseung428/ambientGAN_TF and https://github.com/AshishBora/ambient-gan) and modified the code according to the requirements of each experiment. The pseudocode for AmbientGAN training is given in Algorithm 1. Compared with standard GAN training, the algorithm adds one step: after the latent vectors and the measurement parameters are sampled, the measurement function is applied to the generated samples.
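The data flow of one training iteration, as described above, can be sketched as follows. This is a hedged illustration rather than the authors' Algorithm 1: the function `ambient_gan_step` and all of its arguments are hypothetical stand-ins, and the toy generator and measurement below are scalars chosen only to keep the example runnable.

```python
import random

def ambient_gan_step(real_measurements, sample_z, sample_theta, generator, measure):
    """Build the two batches the discriminator compares in one iteration."""
    batch = len(real_measurements)
    z = [sample_z() for _ in range(batch)]          # latent vectors
    x_gen = [generator(zi) for zi in z]             # generated samples
    theta = [sample_theta() for _ in range(batch)]  # measurement parameters
    # The extra AmbientGAN step: measure the generated samples.
    y_gen = [measure(t, x) for t, x in zip(theta, x_gen)]
    return real_measurements, y_gen

# Toy usage: one-pixel "images" under a Block-Pixels-style measurement.
rng = random.Random(0)
y_real, y_gen = ambient_gan_step(
    real_measurements=[0.0, 1.0, 0.0, 1.0],
    sample_z=lambda: rng.gauss(0.0, 1.0),
    sample_theta=lambda: rng.random() < 0.5,   # True means "block the pixel"
    generator=lambda z: 1.0,                   # stand-in for a trained network
    measure=lambda blocked, x: 0.0 if blocked else x,
)
```

The discriminator update would then use `y_real` and `y_gen` exactly as a standard GAN uses real and generated images, and the generator update would backpropagate through the measurement function.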
We repeated the experiments in the paper on several datasets. For the CelebA dataset, the Block-Pixels, Convolve-Noise, Keep-Patch and Pad-Rotate-Project measurements were evaluated. For the CIFAR-10 dataset, the Block-Pixels measurement was evaluated, and for the MNIST dataset, Pad-Rotate-Project and Pad-Rotate-Project-$\theta$ were applied. To run these experiments, we executed the code using the TensorFlow library on a Titan X GPU, a GTX 1080 Ti GPU, a Tesla K80 GPU from Google's Colab service, and K20 GPUs from the Guillimin compute cluster. Fig. 2 and Fig. 3 show the results of applying different measurements on the CelebA (35000 training iterations) and CIFAR-10 (25000 training iterations) datasets, respectively. The results for the MNIST dataset (70000 training iterations) are presented in the next section.
As shown in Fig. 2, the AmbientGAN model is powerful enough to produce faces with acceptable visual quality, even though the training samples are heavily degraded by the Block-Pixels measurement ($p = 0.95$), which zeroes out most of their pixels. The baseline model, on the other hand, does not show good results, since the approximate inverse of the measurement function cannot fully clean the measured samples under such a severe measurement process. The images on the right are our reproduced results, which confirm the AmbientGAN results in the original paper.
The AmbientGAN also shows promising results for other datasets. As shown in Fig. 3, for the CIFAR-10 dataset under the Block-Pixels measurement with $p = 0.8$, the AmbientGAN produces high-quality samples that resemble CIFAR-10 images, while the baseline model fails to produce meaningful images. In addition to the qualitative results, we also reproduced the quantitative experiments using an Inception model (http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz) trained on the ImageNet dataset, as shown in Fig. 4.
As shown in Fig. 4, as the blocking probability $p$ increases, the inception score of the AmbientGAN model degrades slowly, whereas it degrades dramatically for the baseline model and for the standard GAN trained directly on measured samples (i.e. ignoring the measurement process). In addition, as the number of training iterations increases, the inception score of the AmbientGAN model steadily improves and remains higher than that of the other models with the same $p$.
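For reference, the inception score is commonly defined as the exponential of the average KL divergence between the classifier's label distribution for each image and the marginal label distribution over all images. The sketch below computes it from a list of class-probability vectors. In the experiments these vectors would come from the Inception model; here the function name and the two-class toy inputs are hypothetical.

```python
import math

def inception_score(probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y).

    probs: one class-probability vector per generated image.
    """
    n, num_classes = len(probs), len(probs[0])
    marginal = [sum(p[j] for p in probs) / n for j in range(num_classes)]
    kl_sum = 0.0
    for p in probs:
        # Terms with zero probability contribute nothing to the KL divergence.
        kl_sum += sum(pj * math.log(pj / marginal[j])
                      for j, pj in enumerate(p) if pj > 0)
    return math.exp(kl_sum / n)

# Confident and diverse predictions score higher than uninformative ones.
sharp = inception_score([[1.0, 0.0], [0.0, 1.0]])
flat = inception_score([[0.5, 0.5], [0.5, 0.5]])
```

The score is bounded below by 1 (every image receives the same label distribution) and above by the number of classes (each image is classified with certainty and all classes are equally represented).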
In this section, we first explore different aspects of the experiments and discuss the missing information in the paper. Then, we conclude the report by illustrating how the experiments are aligned with the proposed analysis.
Choosing Hyperparameters: The AmbientGAN structure consists of a generator and a discriminator similar to the standard GAN model, with an additional measurement function $f_\theta$. The authors choose the same implementation and hyperparameters as described in Radford et al. (2015) and Arjovsky et al. (2017) for the units that are shared with the standard GAN. Therefore, the only unit that introduces new parameters in AmbientGAN is the measurement function. For each experiment, we vary the values of these parameters to see the effect of the measurement process on the AmbientGAN. Table 1 shows the selected hyperparameters and measurement parameters.
Failure cases and missing information: The AmbientGAN paper describes an approach for dealing with noisy and incomplete training samples corrupted by common measurement models such as noise and projection. However, the simulation results for Pad-Rotate-Project-$\theta$ are not acceptable for CelebA. As shown in Fig. 5, AmbientGAN only generates a general outline of the face, without clearly rendering the features inside it. This suggests that the AmbientGAN has difficulty learning complex distributions through 1D projections.
As shown in Fig. 5 (b) and (c), AmbientGAN can generate digits with the correct orientation if the value of $\theta$ is provided. However, even for the MNIST dataset, which has a simpler distribution than the CelebA dataset, the AmbientGAN still has problems producing readable digits. The paper also introduces the Gaussian projection as a measurement model, but there is no discussion or simulation result for training AmbientGAN under this measurement. Reconstructing images from projections is a common technique in applications such as MRI, and the original paper does not make clear how well AmbientGAN would work in this area.
Another issue with the AmbientGAN paper is the lack of experiments on measurements in which the location information is lost. For example, there are no simulation results for Extract-Patch that would show how AmbientGAN behaves when the training images contain only parts of faces, removed from their natural locations.
Extra Experiments: The paper evaluates the AmbientGAN model for each measurement process separately. In practice, however, an image might be corrupted by more than one source. Therefore, we applied both Keep-Patch and Block-Pixels as two combined measurements. Among the patch measurements, Keep-Patch removes most of each sample, and its combination with Block-Pixels corrupts the samples even more severely. Fig. 6 shows the simulation results for AmbientGAN in the presence of the combined measurements.
In the combined experiment, the larger the Block-Pixels probability, the longer the training time required for the model to converge. Fig. 6 shows the results of combining Keep-Patch with Block-Pixels at probabilities 0.5, 0.85 and 0.95, respectively. For the first two cases, the AmbientGAN can still generate samples effectively. In the case of probability 0.95, however, AmbientGAN cannot produce visually good results, even though it can generate recognizable images when Block-Pixels with probability 0.95 is applied alone, as shown in the previous experiment.
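A rough calculation helps explain why the combined measurement is so much harsher than Block-Pixels alone. The sketch below computes the expected fraction of pixels that survive Keep-Patch followed by Block-Pixels. The image size of 64 and patch size of 32 are assumptions made purely for illustration, not values stated in this report, and the function name is hypothetical.

```python
def surviving_fraction(image_size, patch_size, p):
    """Expected fraction of pixels untouched by Keep-Patch then Block-Pixels."""
    kept_by_patch = (patch_size / image_size) ** 2  # share of pixels inside the patch
    kept_by_blocking = 1.0 - p                      # share that escapes blocking
    return kept_by_patch * kept_by_blocking

# Assumed sizes: 64 x 64 images with a 32 x 32 kept patch.
combined = {p: surviving_fraction(64, 32, p) for p in (0.5, 0.85, 0.95)}
# Block-Pixels alone is the special case where the patch covers the image.
block_only = surviving_fraction(64, 64, 0.95)
```

Under these assumed sizes, the combined measurement leaves about 12.5%, 3.75% and 1.25% of the pixels on average, compared with 5% for Block-Pixels alone at probability 0.95. This is consistent with the combined case at 0.95 being the hardest one.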
As shown in Fig. 6, the AmbientGAN can still generate good faces as the amount of distortion in the image increases, up to a point. One interesting observation is that almost all of the CelebA experiments preserve location information, so we do not know what happens when it is missing. We wonder why the authors did not report results for Extract-Patch, since location information is not available under that measurement. For Pad-Rotate-Project, the loss of location information may be one of the reasons for the poor results.
In the AmbientGAN paper, the authors tried a variety of measurement processes on different datasets to demonstrate that the proposed AmbientGAN generates better results than the baseline model. They reported results for some severe measurement processes in which the training samples are heavily degraded but the AmbientGAN is still able to generate realistic images. The authors also provided a theoretical analysis showing that, for certain measurement models, the distribution of measured images uniquely determines the distribution of original images.
- Arjovsky et al. (2017) Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
- Bora et al. (2017) Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G Dimakis. Compressed sensing using generative models. arXiv preprint arXiv:1703.03208, 2017.
- Bora et al. (2018) Ashish Bora, Eric Price, and Alexandros G Dimakis. AmbientGAN: Generative models from lossy measurements. In International Conference on Learning Representations (ICLR), 2018.
- Goodfellow et al. (2014) Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.
- Kabkab et al. (2018) Maya Kabkab, Pouya Samangouei, and Rama Chellappa. Task-aware compressed sensing with generative adversarial networks. arXiv preprint arXiv:1802.01284, 2018.
- Radford et al. (2015) Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
- Yeh et al. (2016) Raymond Yeh, Chen Chen, Teck Yian Lim, Mark Hasegawa-Johnson, and Minh N Do. Semantic image inpainting with perceptual and contextual losses. arXiv preprint arXiv:1607.07539, 2016.
- Zhu et al. (2017) Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593, 2017.