Message Passing Multi-Agent GANs

12/05/2016
by Arnab Ghosh, et al.

Communicating and sharing intelligence among agents is an important facet of achieving Artificial General Intelligence. As a first step towards this challenge, we introduce a novel framework for image generation: Message Passing Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have recently been shown to be very effective for image generation and other tasks, most work has been limited to a single generator-discriminator pair. We show that multi-agent GANs that communicate through message passing achieve better image generation. The objectives of the individual agents in this framework are twofold: a co-operation objective and a competing objective. The co-operation objective ensures that the message sharing mechanism guides the other generator to generate better than itself, while the competing objective encourages each generator to generate better than its counterpart. We analyze and visualize the messages that these GANs share among themselves in various scenarios. We quantitatively show that the message sharing formulation serves as a regularizer for the adversarial training. Qualitatively, we show that the different generators capture different traits of the underlying data distribution.


1 Introduction

Unsupervised learning has emerged as one of the most important facets of machine learning research. With the advent of Generative Adversarial Networks (GANs) (goodfellow2014generative), it has become possible to harness large amounts of unlabeled data in the form of a generative model that can produce extremely plausible images (radford2015unsupervised). If we are to target superhuman intelligence, we have to create networks that not only learn from large quantities of data but also interact among themselves in order to learn from, or even compete with, each other. Through this work we present one of the first approaches that engages multiple agents in learning deep unsupervised representations. Note that multiple agents have very recently been explored by sukhbaatar2016learning and foerster2016learning, who employ deep reinforcement learning formulations in order to achieve a shared utility.

Generative Adversarial Networks have recently seen applications in image inpainting (pathak2016context), interactive image generation from just a few brushstrokes (zhu2016generative), image super-resolution (ledig2016photo) and abstract reasoning diagram generation (ghosh2017contextual). GANs have been augmented in several ways to extract structure out of the representations, most notably by chen2016infogan, liu2016coupled and dumoulin2016adversarially. While our multi-agent generator based framework is elegant enough to be applicable in most of the above applications, we demonstrate its application in the task of unsupervised image generation.

Figure 1: Generations of Generator 1 with Uniform(-1,1) noise under conditioned message passing. It captures detailed facial expressions.
Figure 2: Generations of Generator 2 with Normal(0,1) noise under conditioned message passing. It captures smooth facial features.
Figure 3: Generations that look as if they show the process of an artist's creation.

Our work bears close resemblance to the work on adversarial neural cryptography (abadi2016learning), where a cryptographic system is automatically learned based on the varying objectives of three agents: Alice, Bob and Eve. Our conceding and competing objectives are based on these ideas. Multi-agent systems with message passing were first employed by foerster2016learning, and our model of harnessing the messages received from the other generator is based on similar ideas. lazaridou2016multi introduced a message passing model between agents that are forced to co-operate via the introduction of a bottleneck; clustering of the image features shows that the messages for images of the same category are usually the same.

This work presents one of the first forays into this subject, and in a more traditional deep unsupervised learning setting rather than a deep reinforcement learning setting, where the reward structure is discrete and training becomes somewhat more difficult. We present a setting of multi-generator Generative Adversarial Networks with a competing objective function that encourages the two generators to compete with each other in addition to trying to maximally fool the discriminator. We also analyze a conceding objective, which encourages the other generator to become better than oneself. Finally, we introduce a message passing model that makes each generator aware of the generations the other generator is targeting, and hence helps it learn to generate better images.

While developing the message passing model, it became clear that a bottleneck has to be added in order to make the message generator learn meaningful representations of the messages. We therefore demonstrate the performance of the message passing model in the presence of three bottlenecks. The first passes the two generators samples from different noise distributions: one of them receives samples from Normal(0,1) and the other samples from Uniform(-1,1). The message passing model was also analyzed with the two objectives introduced above, the competing objective and the conceding objective, in order to understand the messages and the generations that each of the networks produces in these situations.

The models yielded some interesting results. As seen from Fig. 1 and Fig. 2, without any explicit formulation one of the generators generates images with much more facial detail, while the other generator captures the overall content, including obscure objects such as the last image in Fig. 2, which depicts a woman wearing a cap with her eyes covered by it. Another interesting observation is in Fig. 3, where the message interpolation results between the two generators resemble the process an artist follows for an artistic creation.

In summary our main contributions in the paper are:

  • Presenting a novel framework of multi-agent GANs that comprises multiple generators.

  • Introducing an objective which promotes competition among the generators and another objective which tries to make the other generator better than the current generator.

  • Introducing a novel message passing model, with the messages being passed between the generators in order to better explore the modes in the distribution.

2 Related Work

Unsupervised learning with generative models has made immense progress within a remarkably short time, pioneered most notably by two major directions: Variational Autoencoders (kingma2013auto) and Generative Adversarial Networks (goodfellow2014generative). Efforts have been made to unify the two methods using Adversarial Autoencoders (makhzani2015adversarial). Since Variational Autoencoder based models optimize a maximum likelihood objective, some of the modes of the distribution may remain unexplored.

GANs

Generative Adversarial Networks (GANs) have received tremendous interest in recent times, especially after radford2015unsupervised showed several interesting interpolation-based generations and even arithmetic properties that exist in the latent space. Several applications such as video generation (vondrick2016generating), image manipulation (zhu2016generative) and 3-D object generation (wu2016learning) use GANs as the underlying generative model. Several variants of the GAN training objective have also been proposed in order to stabilize training, such as salimans2016improved and arjovsky2016towards. Objective functions have also been proposed that minimize a divergence other than the Jensen-Shannon divergence used by goodfellow2014generative; for instance, nowozin2016f experiment with various divergences and show improved results.

Conditional GANs

Our technical approach is closely related to the Conditional GANs of mirza2014conditional, which generate images based on class-specific information; reed2016generative, which condition the generation on text; ghosh2017contextual, which condition the generation on all previous inputs via an RNN; and chen2016infogan, which learn special representations of latent variables for an interpretable conditional GAN based model. durugkar2016generative also looked at multi-agent GANs, but their model is based on multiple discriminators rather than multiple generators, and on ensemble principles rather than a message passing based objective. liu2016coupled learn a joint distribution of images by coupling a pair of GANs, i.e. jointly training a pair of generator-discriminator networks such that some of the initial layers of the generators share weights and, similarly, some of the last layers of the discriminators share weights.

Message Passing Models and Co-operating Agents

Belief propagation (weiss2001optimality) based message passing has been one of the major learning algorithms employed in probabilistic graphical models. The paradigm of co-operating agents has also been studied in game theory (cai2011minmax). foerster2016learning and sukhbaatar2016learning introduce formulations of co-operating agents with a message passing model and a common communication channel, respectively. lazaridou2016multi recently introduced a framework in which networks work co-operatively, with a bottleneck that forces the networks to pass messages that are even interpretable by humans.

Competing Agents

Although a Generative Adversarial Network (goodfellow2014generative) is itself modeled as an adversarial game between two agents, with the advent of the competing objective between the generators, the generators start venturing into slightly different regions of the underlying noise space, exploring more modes of the data. lee2016stochastic incorporate competition between deep ensembles by passing the gradient to the best network. abadi2016learning formulate a neural cryptography framework in which Eve is an adversary while Alice and Bob work co-operatively to hide sensitive information from Eve.

3 Models

With the introduction of multiple generators, we add another set of objectives that help us understand the dynamics of the system. We also introduce a version of message passing Generative Adversarial Networks, with several variations, in which messages are passed in order to improve the generations of both generators. The message passing model is augmented with several bottlenecks that encourage the generators to pass meaningful messages.

Competing Objective

The competing objective is based on the principle that the generators also compete with each other to get better scores for their generations from the discriminator. The minimization objective function for generator $G_1$ is:

$$\mathbb{E}_{z_1 \sim p_{z_1}}\big[\log\big(1 - D(G_1(z_1))\big)\big] + f\big(D(G_2(z_2)) - D(G_1(z_1))\big) \quad (1)$$

while the minimization objective function for generator $G_2$ is:

$$\mathbb{E}_{z_2 \sim p_{z_2}}\big[\log\big(1 - D(G_2(z_2))\big)\big] + f\big(D(G_1(z_1)) - D(G_2(z_2))\big) \quad (2)$$

where $f(x) = \max(0, x)$, so that the optimization objective for $G_1$ pushes it to get better scores from the discriminator than $G_2$, and vice versa for $G_2$.
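To make the training signal concrete, here is a minimal sketch of the competing generator losses in PyTorch, assuming `D` returns probabilities in (0, 1) and using $f(x) = \max(0, x)$ as above; the function and variable names are our own illustration, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def competing_generator_losses(D, G1, G2, z1, z2, eps=1e-8):
    """Competing objective (Eqs. 1-2): the usual generator loss plus a
    hinge penalty whenever the other generator outscores this one."""
    d1 = D(G1(z1))  # discriminator scores for G1's generations
    d2 = D(G2(z2))  # discriminator scores for G2's generations

    gan_1 = torch.log(1.0 - d1 + eps).mean()
    gan_2 = torch.log(1.0 - d2 + eps).mean()

    # f(x) = max(0, x); detach the rival's score so each generator
    # only updates its own parameters to win the comparison.
    compete_1 = F.relu(d2.detach() - d1).mean()
    compete_2 = F.relu(d1.detach() - d2).mean()

    return gan_1 + compete_1, gan_2 + compete_2
```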

Conceding Objective

The principle behind this objective is that the two generators try to guide each other so that the other obtains better scores for its generations from the discriminator. This model is similar in structure to the competing objective, but the crucial difference is in how the function $f$ is applied. Here, the minimization objective function for generator $G_1$ is:

$$\mathbb{E}_{z_1 \sim p_{z_1}}\big[\log\big(1 - D(G_1(z_1))\big)\big] + f\big(D(G_1(z_1)) - D(G_2(z_2))\big) \quad (3)$$

while the minimization objective function for generator $G_2$ is:

$$\mathbb{E}_{z_2 \sim p_{z_2}}\big[\log\big(1 - D(G_2(z_2))\big)\big] + f\big(D(G_2(z_2)) - D(G_1(z_1))\big) \quad (4)$$

where $f(x) = \max(0, x)$, so that $G_1$ is penalized whenever its generations score better than those of $G_2$, i.e. it tries to make the generations of $G_2$ better than its own, and vice versa.
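The conceding term only swaps the arguments of the hinge relative to the competing sketch above. A self-contained sketch, again with our own naming, given discriminator scores `d1` and `d2` for the two generators' images:

```python
import torch.nn.functional as F

def conceding_terms(d1, d2):
    """Conceding penalty (Eqs. 3-4): each generator is penalized whenever
    its own score exceeds the other's, nudging the rival ahead of itself."""
    concede_1 = F.relu(d1 - d2.detach()).mean()  # G1 wants D(G2) >= D(G1)
    concede_2 = F.relu(d2 - d1.detach()).mean()  # G2 wants D(G1) >= D(G2)
    return concede_1, concede_2
```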

Message Passing

Our message passing model is based on the principle that messages passed between the two generators will make the generators explore different subspaces of the image manifold, and will also regularize the training of the discriminator by exposing it to different types of images.

Message Passing Model

Figure 4: Model based on message passing without condition. The pair of Message Generators share the parameters between themselves.

Each generator generates images conditioned on the message it receives from the other generator and on the noise sampled from its noise distribution. After both generators have generated their respective images, a common message generator with shared parameters takes each image as input and produces a message; the message generated from each generator's image is passed to the other generator in the next iteration. We also experimented with an individual message generator per generator, but a common message generator works better: since the messages are transferred between the two generators, meaningful messages can only be produced if the same network can gauge the generations of both.
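The loop below is a toy sketch of this exchange, with linear stand-ins for the DCGAN modules; the 100-dimensional noise and 50-dimensional message are illustrative assumptions (50 matches the message dimension reported in Section 4). Note the single message generator `M` serving both directions.

```python
import torch
import torch.nn as nn

Z_DIM, MSG_DIM = 100, 50

class ToyGenerator(nn.Module):
    """Stand-in generator conditioned on noise concatenated with a message."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(Z_DIM + MSG_DIM, 3 * 64 * 64), nn.Tanh())

    def forward(self, z, msg):
        return self.net(torch.cat([z, msg], dim=1)).view(-1, 3, 64, 64)

class ToyMessageGenerator(nn.Module):
    """Shared network mapping a generated image to a message vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(3 * 64 * 64, MSG_DIM)

    def forward(self, img):
        return self.net(img.flatten(1))

G1, G2, M = ToyGenerator(), ToyGenerator(), ToyMessageGenerator()

batch = 8
z1, z2 = torch.randn(batch, Z_DIM), torch.randn(batch, Z_DIM)
m1, m2 = torch.randn(batch, MSG_DIM), torch.randn(batch, MSG_DIM)  # init like noise

for _ in range(3):  # a few message passing iterations
    img1, img2 = G1(z1, m1), G2(z2, m2)
    # Each image is read by the SAME message generator M, and the resulting
    # message is delivered to the OTHER generator in the next iteration.
    m2, m1 = M(img1), M(img2)
```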

The minimization objective function for generator $G_1$ is

$$\mathbb{E}\big[\log\big(1 - D(G_1(x))\big)\big] \quad (5)$$

where $x$ is composed of noise $z_1$ obtained from distribution $p_{z_1}$ and the message $m_1$; $m_1$ is the message for generator $G_1$ created by the message generator (from $G_2$'s image) in the previous iteration, and the messages are initialized from the same distribution as the noise.

Similarly, the minimization objective function for generator $G_2$ is

$$\mathbb{E}\big[\log\big(1 - D(G_2(x'))\big)\big] \quad (6)$$

where $x'$ is composed of noise $z_2$ and message $m_2$. The discriminator is trained such that the generations of both $G_1$ and $G_2$ are labeled as fake.

Conditioned Message Passing Model

Figure 5: Model based on message passing with condition. Both the pairs of Message Generators and Encoders share the parameters between themselves.

Each generated image is passed to the message generator, which produces an output. This output, along with the generator's input, is encoded using a multi-layer perceptron called the Encoder to create the message. Since the message is conditioned both on the generation and on the input of the generator, the Encoder can create much better messages, as it knows what factors led to the generation.
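A sketch of the Encoder under our reading of this description: a small MLP that fuses the message generator's output with the originating generator's own input (its noise plus the incoming message). The input composition and layer sizes here are assumptions for illustration.

```python
import torch
import torch.nn as nn

Z_DIM, MSG_DIM = 100, 50

class Encoder(nn.Module):
    """MLP producing the conditioned message from the message generator's
    output together with the originating generator's input."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(MSG_DIM + Z_DIM + MSG_DIM, hidden),  # M output + noise + incoming msg
            nn.ReLU(),
            nn.Linear(hidden, MSG_DIM),
        )

    def forward(self, m_out, z, m_in):
        return self.net(torch.cat([m_out, z, m_in], dim=1))
```

As with the message generator, a single Encoder with shared parameters would serve both directions, matching Fig. 5.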

The objective being minimized by generator $G_1$ is

$$\mathbb{E}\big[\log\big(1 - D(G_1(x))\big)\big] \quad (7)$$

where $x$ is composed of noise $z_1$ obtained from distribution $p_{z_1}$ and the message $m_1$ produced by the Encoder; the message is initialized from the same distribution as the noise. Analogously, the objective being minimized by generator $G_2$ is

$$\mathbb{E}\big[\log\big(1 - D(G_2(x'))\big)\big] \quad (8)$$

The unconditioned message passing model is oblivious to the input that the generator received in order to generate its images, and hence does not give good generations. The conditioned message generation, on the other hand, gives much better generations, because the messages are conditioned on both the input and the output of the generators.

We consider three different bottlenecks in order to force the messages to be meaningful:

Different Noise Distributions

The noise vectors $z_1$ and $z_2$ that generators $G_1$ and $G_2$ receive are sampled from different distributions. The principle behind this bottleneck is that each generator can master the modes induced by its own noise distribution, and additionally that the messages are forced to differ from simply mirroring the trivial noise distribution they were initialized with. More concretely, Uniform(-1,1) and Normal(0,1) were used for training the pair of generators (cf. Figs. 1 and 2).
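This bottleneck only changes how the noise is drawn; concretely, matching the distributions shown in Figs. 1 and 2:

```python
import torch

batch, z_dim = 64, 100
z1 = torch.rand(batch, z_dim) * 2.0 - 1.0  # G1: Uniform(-1, 1), as in Fig. 1
z2 = torch.randn(batch, z_dim)             # G2: Normal(0, 1), as in Fig. 2
```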

Conceding Objective

In order for the generators to co-operate, pass meaningful messages and make each other better, we provide a model where each generator's objective function tries to make the other generator's generations get better scores from the discriminator, and passes messages accordingly.

The objective function minimized by generator $G_1$ is:

$$\mathbb{E}\big[\log\big(1 - D(G_1(x))\big)\big] + f\big(D(G_1(x)) - D(G_2(x'))\big) \quad (9)$$

and a symmetric version holds for generator $G_2$.

Competing Objective

In order to see the effects of the competing objective combined with the message passing model, and whether rogue messages are passed so that a generator gets better scores for its own generations from the discriminator than the other generator, we provide a model of message passing GANs which compete with each other while passing messages.

The structure is the same as message passing with condition. The objective function minimized by generator $G_1$ is:

$$\mathbb{E}\big[\log\big(1 - D(G_1(x))\big)\big] + f\big(D(G_2(x')) - D(G_1(x))\big) \quad (10)$$

and a symmetric version holds for generator $G_2$.

Discriminator Objective

In the simplest version (when the generators do not pass messages), the objective function being maximized by the discriminator is:

$$\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z_1}\big[\log\big(1 - D(G_1(z_1))\big)\big] + \mathbb{E}_{z_2}\big[\log\big(1 - D(G_2(z_2))\big)\big] \quad (11)$$

When the generators pass messages, only the input of the generators changes to include the messages as well.
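A sketch of the corresponding discriminator update, minimizing the negation of Eq. 11 with both generators' outputs treated as fake; `D` is assumed to return probabilities, and the helper is our own naming.

```python
import torch

def discriminator_loss(D, real, fake1, fake2, eps=1e-8):
    """Negation of Eq. 11: real images labeled real, and the images
    from BOTH generators labeled fake."""
    loss_real = -torch.log(D(real) + eps).mean()
    loss_fake = -(torch.log(1.0 - D(fake1.detach()) + eps).mean()
                  + torch.log(1.0 - D(fake2.detach()) + eps).mean())
    return loss_real + loss_fake
```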

4 Experimental Setup

Model Architecture Details

The generator and discriminator architectures were unaltered from radford2015unsupervised; the only change was the introduction of the message generator, which has an almost identical architecture to the discriminator, except that the number of filters of the final output is changed to the message dimension. After extensive experimentation with different message dimensions, the best results were produced with a message dimension of 50.
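A sketch of how the message generator can be derived from the DCGAN discriminator for 64x64 images: the convolutional trunk is identical, and only the final convolution's output channels change from 1 (the real/fake score) to the 50-dimensional message. The layer hyperparameters are standard DCGAN defaults, used here as assumptions.

```python
import torch.nn as nn

MSG_DIM = 50

def dcgan_trunk(in_ch=3, ndf=64):
    """DCGAN-style discriminator trunk: 64x64 input -> 4x4 feature map."""
    return nn.Sequential(
        nn.Conv2d(in_ch, ndf, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(ndf, ndf * 2, 4, 2, 1), nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1), nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1), nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, inplace=True),
    )

# Discriminator head: 1 output channel. Message generator head: MSG_DIM channels.
discriminator     = nn.Sequential(dcgan_trunk(), nn.Conv2d(512, 1, 4, 1, 0), nn.Sigmoid())
message_generator = nn.Sequential(dcgan_trunk(), nn.Conv2d(512, MSG_DIM, 4, 1, 0), nn.Flatten())
```

The experiments performed are the following: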

Classification

The representation of an image obtained by passing real images through the discriminator, as employed by radford2015unsupervised, was used alongside a novel feature representation enabled by our formulation of the message generator. The interesting aspect of the message generator is that it never saw the real images, only the images produced by the two generators, and yet its feature representation still gives interesting results. The dataset used for the classification experiments is the Street View House Numbers dataset (goodfellow2013multi), which was also used by radford2015unsupervised and salimans2016improved for the evaluation of their techniques. Ablation studies were performed to identify the benefits of the discriminator representation and the message representation individually as well.
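A sketch of the evaluation protocol as we read it: features extracted from the discriminator and/or the message generator feed a linear classifier (radford2015unsupervised used an L2-SVM on discriminator features; the concatenation and the helper below are our assumptions).

```python
import numpy as np
from sklearn.svm import LinearSVC

def fit_linear_probe(disc_feats, msg_feats, labels):
    """Linear SVM on concatenated discriminator + message features,
    i.e. the 'Msg+Disc Rep' column of Table 1."""
    X = np.concatenate([disc_feats, msg_feats], axis=1)  # (N, d1 + d2)
    return LinearSVC().fit(X, labels)
```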

Clustering

The celebrity dataset (liu2015faceattributes) was used to partition the faces based on hair type into five categories: bald, black, brown, blond and gray. The images belonging to these partitions were passed through the message generator to obtain a representation of each image. The representations were then reduced to 2 dimensions using t-SNE and plotted with a different color per category. Somewhat meaningful clusters start emerging from this exercise.
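A sketch of the clustering visualization, assuming `messages` is an (N, 50) array of message-generator outputs for the selected CelebA images and `hair` holds integer category labels; scikit-learn's TSNE does the 2-D reduction.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

HAIR = ("bald", "black", "brown", "blond", "gray")

def plot_message_clusters(messages, hair):
    coords = TSNE(n_components=2).fit_transform(messages)  # 50-D -> 2-D
    for i, name in enumerate(HAIR):
        mask = hair == i
        plt.scatter(coords[mask, 0], coords[mask, 1], s=5, label=name)
    plt.legend()
    plt.show()
```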

Visualization

With the introduction of the message passing mechanism, visualization was done by varying the messages or the noise in order to interpret the manifold learnt by the pair of Generators.

  • Message Interpolation: Keeping the noise constant for the two generators, we can understand the structure of the learnt messages by varying the message used to create the generations.

  • Noise Interpolation: Keeping the messages constant for the two generators, we can understand the impact of the noise by interpolating between two noise vectors.

A very interesting insight that emerged is that interpolation between messages changes the major content of the image, while interpolation between noise vectors produces texture changes. This phenomenon is elucidated further in the results and analysis section; a sketch of both probes is given below.
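The sketch below covers both probes under the toy generator interface `G(z, msg)` used earlier (an assumption of ours); whichever input is not being interpolated is held fixed.

```python
import torch

def interpolate(G, z_a, z_b, m_a, m_b, steps=8, vary="message"):
    """Row of generations interpolating either the message (noise fixed
    at z_a) or the noise (message fixed at m_a)."""
    frames = []
    for t in torch.linspace(0.0, 1.0, steps):
        if vary == "message":
            frames.append(G(z_a, (1 - t) * m_a + t * m_b))
        else:
            frames.append(G((1 - t) * z_a + t * z_b, m_a))
    return torch.cat(frames, dim=0)
```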

5 Results and Analysis

Classification Results on SVHN

As shown in Table 1, all of the proposed models' discriminator representations improve the results over the discriminator representation of DCGAN (radford2015unsupervised), showing that the proposed models regularize the training procedure of the discriminator. The non-trivial accuracy obtained by the message representation, which never saw the real images, is an interesting phenomenon, while the further improvement when the message features are used alongside the discriminator features shows that the message representation learns complementary features that help the overall classification task. The conceding objective performs better than the competing objective in the absence of message passing, but lags behind it once message passing is introduced. Plain message passing does not perform as well as conditioned message passing in the different-noise-distribution experiment, so the remaining message passing experiments were conducted with the conditioned message generator architecture.

Model | Discriminator Rep | Message Rep | Msg+Disc Rep
DCGAN radford2015unsupervised | 22.48% | NA | NA
Improved GANs salimans2016improved | 8.11 ± 1.3% | NA | NA
Different Noise MP | 20.1% | 53.48% | 18.7%
Different Noise CMP | 17.1% | 54.21% | 15.2%
Conceding CMP | 18.37% | 64.46% | 17.4%
Competing CMP | 17.76% | 52.05% | 16.8%
Competing Objective | 18.02% | NA | NA
Conceding Objective | 17.56% | NA | NA
Table 1: Error in classification on SVHN. MP stands for message passing; CMP stands for conditioned message passing.

Clustering

Figure 6: Clustering messages from different noise distribution with conditioned message.

As described in the experimental section, clustering was performed on the messages and visualized using t-SNE in the 2-D space. Some clusters emerge from the messages based on the disjoint division of hair style. As evident from Fig. 6, the messages for the bald hair style separate completely from the rest; black and brown, being somewhat subjective categories, are similar in the message space, although some completely pure clusters for black hair emerge. Gray hair also separates quite clearly from the rest.

Competing Objective

Interpolation results for competing objective.

Figure 7: Noise interpolation for $G_1$ of the Competing Objective. The generations obtained by interpolating over $G_1$'s noise move from an old man wearing spectacles to a smiling lady without spectacles. The generations seem realistic. The generator is able to capture the facial details and even the direction of lighting.
Figure 8: Noise interpolation for $G_2$ of the Competing Objective. It shows that the generations by $G_2$ are more like animated characters. The generator is able to capture the dominating features but not the texture of the images. As the noise is interpolated, the generations go from a cartoon character to its human version.

Conceding Objective

Interpolation results for conceding objective.

Figure 9: Noise interpolation for $G_1$ of the Conceding Objective. The generations go from spiked hair to normal hair to black spectacles. Also, the face shifts and changes to a smile.
Figure 10: Noise interpolation for $G_2$ of the Conceding Objective. The generated personalities change their mood from smiling to sad to shocked. With the interpolation of noise, the faces also change their orientation from straight to tilted.

Message Passing

We consider three bottlenecks:

Different Noise Distribution

As is evident from Table 1, in the case of different noise distributions the conditioned variant performs better, so we consider only conditioned message passing for the next two bottlenecks:

Competing Objective

Interpolation results for competing objective with conditioned message.

Figure 11: Noise interpolation for $G_1$ of the Competing Objective with conditioned message. We see that the generator is able to learn minute details of the face and later on becomes artistic, generating an angel-like image with varied color schemes.
Figure 12: Noise interpolation for $G_2$ of the Competing Objective with conditioned message. It is easy to see that $G_2$ is not learning very detailed features. As the interpolation proceeds between the noise vectors, the figure goes from loose hair to tied hair without any other strong changes.
Figure 13: Message interpolation for $G_1$ of the Competing Objective with conditioned message, made by interpolating between messages for generator $G_1$. It shows the figure changing from a smiling woman to a smiling man. The direction the face is pointing also changes.
Figure 14: Message interpolation for $G_2$ of the Competing Objective with conditioned message. It shows that $G_2$ hasn't learnt detailed features. The generations go from longer hair to shorter hair with changes in lighting.

Conceding Objective

Interpolation results for conceding objective with conditioned message.

Figure 15: Noise interpolation for $G_1$ of the Conceding Objective with conditioned message. With the change in noise, the lighting condition changes; the other visible change is in the appearance of the face.
Figure 16: Noise interpolation for $G_2$ of the Conceding Objective with conditioned message. We see that $G_2$ is modeling a cartoon figure with a band on the forehead. With the interpolation of the noise, the band changes into hair.
Figure 17: Message interpolation for $G_1$ of the Conceding Objective with conditioned message. It resembles the way an artist adds attributes to the face, beginning with the left eye. The facial details also get more prominent across the images.
Figure 18: Message interpolation for $G_2$ of the Conceding Objective with conditioned message. It shows the generations going from a young person with dense hair to an old person with sparse hair. The direction in which the person is looking also changes.

6 Conclusion

We presented several novel architectures and objectives aimed at training multi-agent GANs, along with bottlenecks such as the generators receiving noise from different distributions, generators that compete with each other, and conceding generators that encourage the other generator to perform better than themselves. As is evident from the experiments, the models learn meaningful representations. The introduced architecture regularizes the training of the discriminator, as shown by the improved results of the discriminator representation. The representations obtained from the message generator are quite valuable in themselves, as is evident from the high classification accuracy obtained from a representation that was never shown the real images.

References