1 Introduction
Graphs are ubiquitous in many areas of science and engineering. Graph generation is one of the essential research lines in graph studies, dating back several decades [2]. In recent years, with the increasing popularity of deep generative models, graph generation has again become a hot research topic, spanning a variety of deep learning frameworks, with applications ranging from drug discovery to social network analysis [9, 3]. For example, [1] and [5] have proposed GAN-based models, while [16] exploits the VAE framework. However, compared to similar studies in other domains, such as text and image, much work remains to be done for conditional graph generation. Conditionally generating new samples is advantageous for generative models, as it provides a means of inducing desired characteristics in the generated outputs. In this regard, there exist many text generators conditioned on an input label [17, 18] or sequence [6], as well as image generators that can apply certain constraints on their outputs [12, 19]. With unconditional generators, we cannot control the produced data. Thus, one of the significant issues in designing generative models is creating data samples under different conditions, so that the generated samples have desired characteristics based on class labels or other additional conditions.
This issue also concerns graph data structures, where constructing graphs under given domain-specific conditions can be of great significance, for example in molecular optimization for drug discovery [8]. Considering this, GraphVAE [16] presents a conditional setting by utilizing a molecule decoder for the specific task of molecular graph generation; however, it still cannot guarantee the semantic (chemical) validity of the generated molecules. Recently, AGE [4] proposed an attention-based deep generative model that is conditioned on an input graph and outputs a transformed version of it, analogous to its evolution. Nevertheless, the conditional generation problem has not been directly studied in recent graph research, whereas it has been examined more thoroughly in other fields. Existing graph generation methods have approached this problem with varying procedures, such as directly injecting validity constraints [13] or latent space optimization [11].
In this paper, we study the novel problem of class-conditioned graph generation, whose goal is to learn and generate graph structures given class information. In this regard, we propose CCGG, an autoregressive model for generating graphs under different class labels as a condition. CCGG extends GRAN [10], an architecture that takes advantage of graph recurrent attention networks to generate one block of nodes and its associated edges at each step while preserving the quality of the generated samples. Our model consists of three main parts. First, we train a graph classifier using the GraphSAGE framework [7] and exploit the trained classifier while training the main model. Second, we inject class-condition vectors into the input representations of graph nodes, making CCGG capable of considering class labels while simultaneously maintaining the time efficiency and quality of the generated graphs. Third, we include two new loss functions in our model to capture the requirements of the classification task. The overview of our model is depicted in Fig. 1.

2 Method
This section presents our CCGG model, a deep autoregressive model for class-conditional graph generation. The method adopts a recently introduced deep generative model of graphs, namely GRAN [10], as its core generation strategy due to its state-of-the-art performance among graph generators. CCGG then injects the class label of the graphs into the training procedure to make the model class-conditional. In the following subsections, we first provide background on the GRAN model. Then, we introduce our model by discussing its components, including the node representations, edge generation, classification step, and loss function. Moreover, we examine the gradient flow problem in the training phase and how we resolved it.
2.1 Background: GRAN model
GRAN is a deep generative model for an undirected graph $G = (V, E)$, where $V$ is the graph's node set and $E$ represents its edge set. Instead of employing the RNN-based graph generative models used in previous works [22], GRAN uses Graph Neural Networks (GNNs) with an attention mechanism, enabling the model to benefit from the advantages of GNNs in modeling the intrinsic characteristics of graphs. This also helps overcome the long-term dependency bottleneck of RNNs. Given a node ordering $\pi$, $A^{\pi}$ denotes the adjacency matrix of the undirected graph under $\pi$, and $L^{\pi}$ represents its lower triangular part. Assuming the symmetry of $A^{\pi}$, we have $A^{\pi} = L^{\pi} + L^{\pi\top}$. At each step, a block of size $B$ is generated, which corresponds to $B$ rows of $L^{\pi}$. The block at step $t$ is represented as $L^{\pi}_{b_t}$, where $b_t = \{B(t-1)+1, \ldots, Bt\}$ is the set of row indices generated in that step. Thus, the probability of generating $L^{\pi}$ can be presented as:

$$p(L^{\pi}) = \prod_{t=1}^{T} p\big(L^{\pi}_{b_t} \mid L^{\pi}_{b_1}, \ldots, L^{\pi}_{b_{t-1}}\big) \qquad (1)$$
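The block-wise factorization in Eq. (1) can be illustrated with a minimal NumPy sketch. The function and variable names are illustrative, and the conditional edge probabilities here are a fixed stand-in; in GRAN they come from a GNN over the partially generated graph:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_graph(n_nodes: int, block_size: int, edge_prob) -> np.ndarray:
    """Sample a strictly lower-triangular adjacency matrix L block by block.

    edge_prob(L, i) returns Bernoulli parameters for the candidate edges
    of row i, conditioned on everything generated so far. In GRAN this
    conditional comes from a GNN; here it is a placeholder.
    """
    L = np.zeros((n_nodes, n_nodes))
    for start in range(0, n_nodes, block_size):
        for i in range(start, min(start + block_size, n_nodes)):
            # Only entries j < i are free: L is strictly lower triangular.
            p = edge_prob(L, i)                # shape (i,)
            L[i, :i] = rng.random(i) < p
    return L

# Stand-in conditional: the same edge probability for every candidate edge.
L = sample_graph(n_nodes=6, block_size=2,
                 edge_prob=lambda L, i: np.full(i, 0.5))
A = L + L.T                                     # recover the symmetric adjacency
```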
GRAN also considers a family of canonical node orderings to estimate the true log-likelihood, which is intractable due to the factorial number of orderings in terms of the number of graph nodes.

2.2 CCGG
2.2.1 Node Representations and Edge Generation
Inspired by [10], at the $t$-th step, we first obtain an initial node representation for the previously generated nodes by concatenating the outputs of two linear mappings:

$$\tilde{h}^{0}_{b_t} = W_h L^{\pi}_{b_t} + b_h \qquad (2)$$

$$c = W_c\, y + b_c \qquad (3)$$

$$h^{0}_{b_t} = \mathrm{concat}\big(\tilde{h}^{0}_{b_t}, c\big) \qquad (4)$$

where $L^{\pi}_{b_t}$ is a block of vectors of size $N$, representing the elements of $L^{\pi}$ in the rows $b_t$, and $N$ is the maximum allowed number of graph nodes. Thus, $\tilde{h}^{0}_{b_t}$ is the pre-initial representation of the block's nodes, calculated via a linear mapping. Moreover, $c$ is the class-conditional vector of the graph, obtained by applying a linear mapping to the class label $y$. These linear mappings are mainly utilized to reduce the size of vectors in large graphs and to make the model more flexible in modeling the features of each class. Finally, we obtain the initial node representations $h^{0}_{b_t}$ by concatenating $\tilde{h}^{0}_{b_t}$ and $c$, which enables the model to generate new blocks with respect to the desired conditions on the graph. Since $L^{\pi}_{b_t}$ is not yet generated for the current block, we set $L^{\pi}_{b_t} = 0$.
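The two linear maps and the concatenation amount to a few matrix products. A minimal NumPy sketch, with illustrative (not the paper's) weight names and dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)

N, d_struct, d_class, n_classes = 8, 6, 4, 2     # illustrative sizes

W_h = rng.standard_normal((d_struct, N))          # structural map (cf. Eq. 2)
W_c = rng.standard_normal((d_class, n_classes))   # class-label map (cf. Eq. 3)

def initial_representations(L_rows: np.ndarray, class_id: int) -> np.ndarray:
    """Concatenate structural and class-conditional features (cf. Eq. 4)."""
    h_tilde = L_rows @ W_h.T                      # (block, d_struct)
    y = np.eye(n_classes)[class_id]               # one-hot class label
    c = W_c @ y                                   # (d_class,)
    c_block = np.broadcast_to(c, (L_rows.shape[0], d_class))
    return np.concatenate([h_tilde, c_block], axis=1)

# The current block's rows of L^pi are not generated yet, so they are zeros.
h0 = initial_representations(np.zeros((2, N)), class_id=1)
```

Every node in the block shares the same class-conditional subvector, so the condition is visible to the GNN at every message-passing step.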
Then, the initial node representations are given to a GNN, similar to the one used in GRAN, to obtain the final representations $h^{R}_{i}$ for each node $i$ after $R$ message-passing steps. Finally, following GRAN, the edges connecting the nodes of the $t$-th block to the other nodes are generated using a mixture of Bernoulli distributions. The parameters of the distributions are obtained via two MLPs with ReLU nonlinearities, using the final node representations derived from the last step:

$$\theta_{k,i,j} = \sigma\big(\mathrm{MLP}_{\theta}(h^{R}_{i} - h^{R}_{j})\big) \qquad (5)$$

$$\alpha_{1}, \ldots, \alpha_{K} = \mathrm{Softmax}\Big(\sum_{i \in b_t,\, 1 \le j \le i} \mathrm{MLP}_{\alpha}(h^{R}_{i} - h^{R}_{j})\Big) \qquad (6)$$

Therefore, the probability of generating the $t$-th block can be written as:

$$p\big(L^{\pi}_{b_t} \mid L^{\pi}_{b_1}, \ldots, L^{\pi}_{b_{t-1}}\big) = \sum_{k=1}^{K} \alpha_{k} \prod_{i \in b_t} \prod_{1 \le j \le i} \mathrm{Bernoulli}\big(L^{\pi}_{i,j};\, \theta_{k,i,j}\big) \qquad (7)$$

Using $K$ components in the mixture of Bernoulli distributions lets us model dependencies among each block's generated edges. However, in the case of $K = 1$, the distribution turns into a single Bernoulli, assuming independence of the new edges, which may not be valid.
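A minimal sketch of the mixture-of-Bernoulli block likelihood: in the model, the per-edge parameters and mixture weights come from the two MLPs over differences of final node representations; here they are random stand-ins, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

K, n_edges = 20, 5          # mixture components, candidate edges in the block

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Stand-ins for the MLP outputs over pairs (h_i^R - h_j^R).
theta = sigmoid(rng.standard_normal((K, n_edges)))   # per-component edge probs
alpha = softmax(rng.standard_normal(K))              # mixture weights

def block_log_prob(edges: np.ndarray) -> float:
    """log p(block): mixture over K of products of Bernoulli terms (cf. Eq. 7)."""
    per_edge = np.where(edges == 1, theta, 1.0 - theta)   # (K, n_edges)
    per_component = per_edge.prod(axis=1)                 # (K,)
    return float(np.log(alpha @ per_component))

edges = np.array([1, 0, 1, 1, 0])
lp = block_log_prob(edges)
```

Because the mixture weights are shared across all edges of the block, the K > 1 case couples the edges, while K = 1 reduces to an independent-Bernoulli model, matching the discussion above.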
Table 1: Statistics-based evaluation metrics of the generated graphs for each class (0/1) of the NCI1 and PROTEINS datasets.

Dataset   Model     LCC              TC                CPL             Mean D          GINI
                    0       1        0        1        0      1       0      1       0      1
NCI1      Real      25.206  36.052    0.034    0.092   4.974  6.239   2.163  2.194   0.085  0.120
          CondGen    9.971  16.043   15.026   27.097   2.605  3.726   0.839  0.620   0.338  0.394
          CCGG       1.225   1.416    1.873    2.397   0.925  1.647   0.207  0.264   0.022  0.032
PROTEINS  Real      47.670  22.267   34.302   17.242   5.605  3.330   3.798  3.641   0.052  0.035
          CondGen   19.625   6.578  155.396   38.912   3.741  1.662   0.149  1.520   0.479  0.255
          CCGG       3.330   2.347   73.205   11.051   3.009  1.074   1.679  0.727   0.056  0.005
Table 2: Classification accuracy of the generated graphs per class, evaluated by three graph classifiers.

Dataset   Class   GraphSAGE           DiffPool            DGCNN
                  CondGen   CCGG      CondGen   CCGG      CondGen   CCGG
NCI1      0       55.32%    61.00%    71.73%    70.37%    64.81%    51.61%
          1       78.07%    81.91%    62.11%    67.01%    69.27%    74.71%
PROTEINS  0       74.51%    75.16%    81.62%    88.81%    71.09%    71.55%
          1       58.47%    56.23%    63.13%    65.24%    55.38%    60.48%
Table 3: AUC of the generated graphs, evaluated by three graph classifiers.

Dataset   GraphSAGE         DiffPool          DGCNN
          CondGen   CCGG    CondGen   CCGG    CondGen   CCGG
NCI1      0.66      0.71    0.63      0.65    0.65      0.63
PROTEINS  0.67      0.69    0.72      0.77    0.59      0.67
2.2.2 Node Labels
To predict the node labels that may be present in the dataset, we append a two-layer MLP with ReLU nonlinearity to CCGG, which uses the final node representations as its input. This component is included in the model for utilization in the classification step.
2.2.3 Classification
We have employed GraphSAGE [7], a promising framework for inductive representation learning on large graphs, to train a classifier before training the main model, by attaching a fully connected layer with a Sigmoid activation function to it. During each step of training the main model, the updated graph is fed to the classifier to perform classification and calculate the corresponding loss.
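A single GraphSAGE layer with mean aggregation, the building block of such a classifier, can be sketched as follows (weights, sizes, and names are illustrative, not the trained classifier's):

```python
import numpy as np

rng = np.random.default_rng(3)

d_in, d_out = 4, 3
W_self = rng.standard_normal((d_out, d_in))    # transforms a node's own features
W_neigh = rng.standard_normal((d_out, d_in))   # transforms aggregated neighbors

def sage_mean_layer(A: np.ndarray, H: np.ndarray) -> np.ndarray:
    """One GraphSAGE layer with mean aggregation over neighbors."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                        # isolated nodes keep zero features
    neigh_mean = (A @ H) / deg                 # mean of neighbor features
    return np.maximum(H @ W_self.T + neigh_mean @ W_neigh.T, 0.0)  # ReLU

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
H = rng.standard_normal((3, d_in))
H1 = sage_mean_layer(A, H)
```

During training, the (sampled) partial adjacency matrix is passed through such layers, pooled into a graph-level vector, and fed to the sigmoid head to obtain the classification loss.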
2.2.4 Loss Function
The loss function of our model consists of three components. First is the loss of the GRAN model, denoted as $\mathcal{L}_{\mathrm{GRAN}}$. As mentioned in Subsection 2.1, it is the negative log-likelihood of the generated graphs, estimated by considering a family of canonical node orderings. Second, we have the classification loss, denoted as $\mathcal{L}_{\mathrm{cls}}$, which is calculated as the cross-entropy loss of the generated samples, multiplied by a power of the discount factor $\gamma$ at each step:

$$\mathcal{L}_{\mathrm{cls}} = \gamma^{\,n - m}\, \mathrm{CE}(\hat{y}, y) \qquad (8)$$

where $n$ and $m$ represent the number of nodes of the original graph and of the input subgraph, respectively. The idea behind the discount factor is that, in the generation process, the closer the size of the subgraph is to that of the original graph, the easier it is to add a new node. More precisely, assuming that the input subgraph has $n - 1$ nodes, it is relatively easy to reconstruct the original graph by generating the only remaining node while preserving the desired graph properties. On the other hand, given a subgraph with only a small fraction of the nodes, it would be far more difficult for the model to generate a new node (or block of nodes) having the desired conditions.

Third, we include the node label prediction loss $\mathcal{L}_{\mathrm{node}}$, which is the cross-entropy loss of the generated node labels $\hat{z}$, calculated in the same way as $\mathcal{L}_{\mathrm{cls}}$:

$$\mathcal{L}_{\mathrm{node}} = \gamma^{\,n - m}\, \mathrm{CE}(\hat{z}, z) \qquad (9)$$

Summing up these losses, the total loss for the CCGG model can be written as:

$$\mathcal{L} = \mathcal{L}_{\mathrm{GRAN}} + \lambda_{1}\, \mathcal{L}_{\mathrm{cls}} + \lambda_{2}\, \mathcal{L}_{\mathrm{node}} \qquad (10)$$

where $\lambda_{1}$ and $\lambda_{2}$ are the model's hyperparameters, weighting each loss in the training phase. They can be initialized separately for each dataset.
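The discounted cross-entropy terms and the weighted total can be sketched in a few lines; the values of gamma, lambda1, lambda2, and the probability inputs below are illustrative:

```python
import numpy as np

def cross_entropy(probs: np.ndarray, target: int) -> float:
    """Cross-entropy of a predicted class distribution against a target label."""
    return float(-np.log(probs[target]))

def discounted_ce(probs, target, gamma, n, m):
    """Eqs. (8)/(9): cross-entropy discounted by gamma^(n - m), where n is the
    original graph size and m the current subgraph size."""
    return gamma ** (n - m) * cross_entropy(probs, target)

def total_loss(l_gran, l_cls, l_node, lam1, lam2):
    """Eq. (10): weighted sum of the three loss components."""
    return l_gran + lam1 * l_cls + lam2 * l_node

# Early in generation (small m) the classifier term is discounted more strongly.
early = discounted_ce(np.array([0.3, 0.7]), 1, gamma=0.8, n=50, m=5)
late = discounted_ce(np.array([0.3, 0.7]), 1, gamma=0.8, n=50, m=49)
```

Since 0.8^45 is far smaller than 0.8^1, the early-step penalty is much lighter, matching the intuition that generating well-conditioned nodes from a tiny subgraph is harder.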
2.2.5 Gradient Flow Problem
The discrete nature of graphs causes a gradient flow problem when backpropagating the classifier's loss. Specifically, to feed adjacency matrices to the classifier, we need to sample the output Bernoulli distributions at each step, which makes backpropagation intractable. We solve this problem by using stochastic binary neurons, a technique proposed in [14]. Following this method, we sample stochastic binary neurons for generating new edges in the training phase, while in the evaluation phase, we sample from the mixture of Bernoulli distributions.

3 Experiments
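The forward pass of a stochastic binary neuron can be sketched as below. The backward-pass comment describes the straight-through estimator, a common way to differentiate through such discrete samples; whether [14] uses exactly this form is an assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stochastic_binary(logits: np.ndarray) -> np.ndarray:
    """Forward pass of a stochastic binary neuron: hard {0, 1} samples.

    In training, the backward pass would treat the sampling as the identity
    (a straight-through estimator), so gradients flow to the logits even
    though the forward output is discrete.
    """
    p = sigmoid(logits)
    return (rng.random(p.shape) < p).astype(float)

logits = np.array([-4.0, 0.0, 4.0])
b = stochastic_binary(logits)
```

This gives the classifier a hard adjacency matrix during training while keeping the edge logits trainable; at evaluation time, edges are instead sampled from the mixture of Bernoulli distributions directly.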
In this section, we validate the efficacy of our proposed model in generating class-conditioned graphs by performing experiments on two different real-world datasets. We have also tested CCGG's outputs against two other graph classifiers, namely DGCNN [15] and DiffPool [21], to confirm its validity. Note that these classifiers were not used during the training phase.
3.1 Datasets
PROTEINS: This dataset contains 1113 graphs with 100 to 620 nodes. Each graph represents a protein structure, with amino acids as nodes; an edge connects two nodes if they are less than 6 Angstroms apart. The graphs are categorized into two classes: enzymes and non-enzymes.
NCI1: NCI1 is a cheminformatics dataset of 4110 graphs representing chemical compounds, classified as positive or negative for cell lung cancer. The atoms of a compound's molecule form the graph's nodes, and their bonds denote the edges. The number of nodes ranges from 8 to 111.
3.2 Compared Method
The problem of class-conditional graph generation is relatively unexplored; therefore, few baseline models are available for comparison. We test the performance of CCGG against CONDGEN [20], a formerly proposed method with state-of-the-art results in conditional graph generation, based on Generative Adversarial Networks (GANs). However, since CONDGEN does not directly address the class-conditional problem, we have set its condition to the class label for the models' equivalence. We have used the same experimental setup during the train and test steps to perform a fair comparison with the baseline.
3.3 Metrics
For evaluating the generated graphs, we have used several statistics-based metrics describing different graph features: LCC (size of the Largest Connected Component), TC (Triangle Count), CPL (Characteristic Path Length), Mean D (Mean Degree of Nodes), and GINI (Gini index of the degree distribution). According to the results of these metrics, shown in Table 1, CCGG outperforms the baseline. Furthermore, we report the classification accuracy for each class of the datasets, as well as the Area Under the Curve (AUC), in Tables 2 and 3, respectively. Our model achieves better results than CONDGEN on both datasets, with very few exceptions. Utilizing the class vectors in the model's input and using the node label predictor facilitates the class-conditional generation task. The quality of the generated graphs is also preserved and, in most cases, improved (as confirmed by Table 1). This is because the model is trained to minimize CCGG's loss function in Eq. (10), which maintains the balance between the tasks.
3.4 Setup
We have used three GraphSAGE convolutional layers for our classifier model, with 32 output channels, followed by a two-layer MLP with ReLU nonlinearity. The size of the classifier's final output equals the number of classes in each dataset. For the main CCGG model, the node representation dimensions are set to 512 for NCI1 and 384 for PROTEINS. In each case, a subvector of size 256 is used for representing the class data. As recommended by the GRAN paper and confirmed by our results, we set the number of Bernoulli mixture components to 20 and used 7 layers of GNNs. For our node label predictor, we used a three-layer MLP with ReLU nonlinearity. Its input is the node representation, and the hidden dimensions of the middle layers are set to 256. The output's size equals the number of node labels in each dataset, which is 3 and 38 for the PROTEINS and NCI1 datasets, respectively. We set the discount factor $\gamma$ to 0.8 in our loss function. Moreover, we used the Adam optimizer for training the different parts of our model.
4 Conclusion
This paper introduced CCGG, an autoregressive model for class-conditioned graph generation based on a previously proposed generative model. We used two real-world datasets and achieved state-of-the-art performance on them. Our future direction will focus on extending the model's scalability to generating larger graphs. We hope this work will inspire further research on new applications of class-conditional graph generation.
References

[1] (2018) NetGAN: generating graphs via random walks. In International Conference on Machine Learning, pp. 609–618.
[2] (1960) On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5 (1), pp. 17–60.
[3] (2021) Deep graph generators: a survey. IEEE Access 9, pp. 106675–106702.
[4] (2020) Attention-based graph evolution. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 436–447.
[5] (2020) Multi-MotifGAN (MMGAN): motif-targeted graph generation and prediction. In ICASSP 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4182–4186.
[6] (2019) CTextGen: conditional text generation for harmonious human-machine interaction. CoRR abs/1909.03409.
[7] (2017) Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024–1034.
[8] (2019) Learning multimodal graph-to-graph translation for molecular optimization. arXiv:1812.01070.
[9] (2018) Multi-objective de novo drug design with conditional graph generative model. Journal of Cheminformatics 10 (1), pp. 33.
[10] (2019) Efficient graph generation with graph recurrent attention networks. In Advances in Neural Information Processing Systems, pp. 4257–4267.
[11] (2018) Constrained graph variational autoencoders for molecule design. In Advances in Neural Information Processing Systems, pp. 7795–7804.
[12] (2017) Conditional image generation using feature-matching GAN. In 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5.
[13] (2018) Constrained generation of semantically valid graphs via regularizing variational autoencoders. In Advances in Neural Information Processing Systems, pp. 7113–7124.
[14] (2019) Probabilistic computing with binary stochastic neurons. In 2019 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), pp. 1–6.
[15] (2018) DGCNN: a convolutional neural network over large-scale labeled graphs. Neural Networks 108, pp. 533–543.
[16] (2018) GraphVAE: towards generation of small graphs using variational autoencoders. In International Conference on Artificial Neural Networks, pp. 412–422.
[17] (2018) SentiGAN: generating sentimental texts via mixture adversarial networks. In IJCAI.
[18] (2019) Topic-guided variational autoencoders for text generation. arXiv:1903.07137.
[19] (2016) Attribute2Image: conditional image generation from visual attributes. In European Conference on Computer Vision, pp. 776–791.
[20] (2019) Conditional structure generation through graph variational generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 1340–1351.
[21] (2018) Hierarchical graph representation learning with differentiable pooling. CoRR abs/1806.08804.
[22] (2018) GraphRNN: generating realistic graphs with deep autoregressive models. In International Conference on Machine Learning, pp. 5708–5717.