Graphs are ubiquitous in many areas of science and engineering. Graph generation is one of the essential research lines in graph studies and dates back several decades [2]. In recent years, with the increasing popularity of deep generative models, graph generation has again become a hot topic, building on a variety of deep learning frameworks, with applications ranging from drug discovery to social network analysis. For example, [1] and [5] have proposed GAN-based models, while [16] exploits the VAE framework.
However, compared to similar studies in other domains, such as text and image, there is still much work to be done on conditional graph generation. Conditional generation is advantageous for generative models because it provides a means of inducing desired characteristics in the generated outputs. In this regard, there exist many text generators that are conditioned on an input label [17, 18] or sequence, as well as image generators that can apply certain constraints to their generated output [12, 19]. With unconditional generators, we cannot control the produced data. Thus, one of the significant issues in designing generative models is creating data samples under different conditions, so that the generated samples have desired characteristics based on class labels or any other additional condition.
This issue is also a concern for graph data, where constructing graphs under given domain-specific conditions can be of great significance, as in molecular optimization for drug discovery. Considering this, GraphVAE [16] presents a conditional setting of its model, utilizing a molecule decoder, for the specific task of molecular graph generation. However, it still cannot guarantee the semantic (chemical) validity of the generated molecules. Recently, AGE [4] has proposed a deep generative model with attention, which can be conditioned on an input graph and outputs a transformed version of it, analogous to its evolution. Nevertheless, the conditional generation problem has not been directly studied in recent graph studies, whereas it has been examined more thoroughly in other fields. Existing graph generation methods have approached this problem with varying procedures, such as directly injecting validity constraints or optimizing in the latent space.
In this paper, we study the novel problem of class-conditioned graph generation, whose goal is to learn and generate graph structures given class information. To this end, we propose CCGG, an autoregressive model for generating graphs conditioned on class labels. CCGG extends GRAN [10], an architecture that uses graph recurrent attention networks to generate one block of nodes and its associated edges at each step while preserving the quality of the generated samples. Our model consists of three main parts. First, we train a graph classifier using the GraphSAGE framework [7] and exploit the trained classifier while training the main model. Second, we inject class-condition vectors into the input representations of graph nodes, making CCGG capable of considering class labels while maintaining the time efficiency and quality of the generated graphs. Third, we include two new loss functions in our model to capture the requirements of the classification task. An overview of our model is depicted in Fig. 1.
This section presents our CCGG model, a deep autoregressive model for class-conditional graph generation. The method adopts a recently introduced deep generative model of graphs, the GRAN model [10], as its core generation strategy due to its state-of-the-art performance among graph generators. CCGG then injects the class label of the graphs into the training procedure to make the model class-conditional. In the following subsections, we first provide background on the GRAN model. Then, we introduce our model by discussing its components, including the node representations, edge generation, classification step, and loss function. Finally, we examine the gradient flow problem in the training phase and how we resolved it.
2.1 Background: GRAN model
GRAN is a deep generative model for an undirected graph $G = (V, E)$, where $V$ is the graph's node set and $E$ represents its edge set. Instead of employing the RNN-based graph generative models used in previous works [22], GRAN uses Graph Neural Networks (GNNs) with an attention mechanism, enabling the model to benefit from the advantages of GNNs in modeling the intrinsic characteristics of graphs. This also helps overcome the long-term bottleneck of RNNs. Given a node ordering $\pi$, $A^{\pi}$ denotes the adjacency matrix of an undirected graph under $\pi$, and $L^{\pi}$ represents its lower triangular part. Assuming the symmetry of $A^{\pi}$, we have $A^{\pi} = L^{\pi} + L^{\pi\top}$. At each step, a block of size $B$ is generated, which corresponds to $B$ rows of $L^{\pi}$. The block generated in the $t$-th step is represented as $L^{\pi}_{b_t}$ and is a sequence of vectors $[L^{\pi}_{i} \mid i \in b_t]$, where $b_t = \{B(t-1)+1, \dots, Bt\}$. Thus, the probability of generating $L^{\pi}$ can be presented as:

$$p(L^{\pi}) = \prod_{t=1}^{T} p\left(L^{\pi}_{b_t} \mid L^{\pi}_{b_1}, \dots, L^{\pi}_{b_{t-1}}\right),$$

where $T$ is the number of generation steps.
GRAN also considers a family of canonical node orderings to estimate the true log-likelihood, which is intractable due to the factorial number of orderings in terms of the number of graph nodes.
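The block-wise factorization above can be illustrated with a small sketch. It is not taken from the GRAN code base: the helper names `lower_triangle` and `row_blocks` are hypothetical, indices are 0-based (the text uses 1-based rows), and the toy graph is made up for illustration.

```python
import math

def lower_triangle(adj):
    """Keep entries on or below the diagonal of a square 0/1 adjacency matrix."""
    n = len(adj)
    return [[adj[i][j] if j <= i else 0 for j in range(n)] for i in range(n)]

def row_blocks(num_nodes, block_size):
    """Split row indices 0..num_nodes-1 into consecutive blocks of at most block_size rows."""
    num_steps = math.ceil(num_nodes / block_size)
    return [list(range(t * block_size, min((t + 1) * block_size, num_nodes)))
            for t in range(num_steps)]

# Toy undirected graph: the path 0-1-2-3-4 (symmetric, no self-loops).
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
lower = lower_triangle(adj)
blocks = row_blocks(num_nodes=5, block_size=2)

# One generation step emits the rows of the lower triangle indexed by one block.
steps = [[lower[i] for i in block] for block in blocks]

# Symmetry lets the full adjacency matrix be rebuilt from the lower triangle alone.
rebuilt = [[lower[i][j] + lower[j][i] for j in range(5)] for i in range(5)]
```

With five nodes and a block size of two, generation takes three steps, and the last block holds a single row.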
2.2.1 Node Representations and Edge Generation
Inspired by [10], at the $t$-th step, we first obtain an initial node representation for the previously generated nodes by concatenating the outputs of two linear mappings:

$$h'^{0}_{b_i} = W L^{\pi}_{b_i} + b, \quad \forall i < t,$$

$$c' = W_c\, c + b_c,$$

$$h^{0}_{b_i} = \left[\, h'^{0}_{b_i} \,\Vert\, c' \,\right],$$

where $L^{\pi}_{b_i}$ is a block of vectors of size $BN$, representing the elements of $L^{\pi}$ in the rows $b_i$, $N$ is the maximum allowed number of graph nodes, and $W$, $b$, $W_c$, and $b_c$ are learnable parameters. Thus, $h'^{0}_{b_i}$ is the pre-initial representation for the block nodes, calculated via a linear mapping. Moreover, $c$ is the class-conditional vector of the graph, and $c'$ is obtained by applying a linear mapping to it. These linear mappings are mainly utilized to reduce the size of the vectors in large graphs and to make the model more flexible in modeling the features of each class. Finally, we obtain the initial node representations $h^{0}_{b_i}$ by concatenating $h'^{0}_{b_i}$ and $c'$, which enables the model to generate new blocks with respect to the desired conditions on the graph. Since $L^{\pi}_{b_t}$ is not yet generated for the current block, we set $h'^{0}_{b_t} = 0$.
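As a rough illustration of the concatenation step described above, the sketch below projects a flattened block of adjacency rows and a class vector separately and joins the two results. The function names, the one-hot encoding of the class, and all weight values are assumptions chosen for the example, not values from the model.

```python
def linear(weights, bias, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def initial_representation(block_rows, class_vector, W, b, W_c, b_c):
    """Concatenate the projected block rows with the projected class vector."""
    flat = [v for row in block_rows for v in row]  # B rows of length N -> vector of size B*N
    h_pre = linear(W, b, flat)                     # pre-initial representation of the block
    c_proj = linear(W_c, b_c, class_vector)        # projected class-conditional vector
    return h_pre + c_proj                          # list concatenation

# Hypothetical toy sizes: B = 2 rows, N = 3 nodes, two classes, 2-dim projections.
block_rows = [[1, 0, 0], [1, 1, 0]]
class_vector = [0.0, 1.0]                          # assumed one-hot class label
W = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0, 1.0, 0.0]]
b = [0.0, 0.0]
W_c = [[1.0, 0.0], [0.0, 1.0]]
b_c = [0.5, 0.5]

h0 = initial_representation(block_rows, class_vector, W, b, W_c, b_c)
```

Because the class projection is appended to every block representation, each node carries the class condition into the subsequent GNN layers.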
Then, the initial node representations are given to a GNN, similar to the one used in GRAN, to obtain the final representation $h^{R}_{i}$ for each node $i$ in $R$ steps. Finally, following GRAN, the edges connecting the nodes of the $t$-th block to other nodes are generated using a mixture of Bernoulli distributions. The parameters of the distributions are obtained via two MLPs with ReLU nonlinearities, using the final node representations derived from the last step.
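Therefore, the probability of generating the $t$-th block can be written as:

$$p\left(L^{\pi}_{b_t} \mid L^{\pi}_{b_1}, \dots, L^{\pi}_{b_{t-1}}\right) = \sum_{k=1}^{K} \alpha_k \prod_{i \in b_t} \prod_{1 \le j \le i} \theta_{k,i,j},$$

$$\alpha_1, \dots, \alpha_K = \mathrm{Softmax}\Big(\sum_{i \in b_t,\, 1 \le j \le i} \mathrm{MLP}_{\alpha}\big(h^{R}_{i} - h^{R}_{j}\big)\Big),$$

$$\theta_{1,i,j}, \dots, \theta_{K,i,j} = \mathrm{Sigmoid}\big(\mathrm{MLP}_{\theta}(h^{R}_{i} - h^{R}_{j})\big),$$

where $K$ is the number of mixture components and $h^{R}_{i}$ denotes the final representation of node $i$. Using $K$ components of a mixture of Bernoulli distributions lets us capture the dependency among each block's generated edges. However, in the case of $K = 1$, the distribution turns into a single Bernoulli, assuming the independence of the new edges, which may not be valid.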
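To make the dependence argument concrete, the following sketch evaluates a mixture of Bernoulli distributions over two candidate edges. The mixture weights and edge probabilities are made-up numbers, and `mixture_bernoulli_likelihood` is a hypothetical helper, not part of any released implementation.

```python
def mixture_bernoulli_likelihood(edges, alphas, thetas):
    """p(edges) = sum_k alpha_k * prod_e theta[k][e]^x_e * (1 - theta[k][e])^(1 - x_e)."""
    total = 0.0
    for alpha_k, theta_k in zip(alphas, thetas):
        prob = 1.0
        for x, th in zip(edges, theta_k):
            prob *= th if x == 1 else (1.0 - th)
        total += alpha_k * prob
    return total

# Two components: one favors both edges being present, the other both being absent.
alphas = [0.5, 0.5]
thetas = [[0.9, 0.9], [0.1, 0.1]]

p_both = mixture_bernoulli_likelihood([1, 1], alphas, thetas)
# Marginal probability that the first edge exists (sum over the second edge).
p_first = sum(mixture_bernoulli_likelihood([1, x], alphas, thetas) for x in (0, 1))

# With a single component the joint probability factorizes over edges.
p_single = mixture_bernoulli_likelihood([1, 1], [1.0], [[0.5, 0.5]])
```

Here each edge is present with marginal probability 0.5, yet both edges appear together with probability 0.41 rather than 0.25, so the two edges are correlated. The single-component case gives exactly 0.25, the product of the marginals.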
| NCI1 | 0 | 55.32% | 61.00% | 71.73% | 70.37% | 64.81% | 51.61% |
| NCI1 | 1 | 78.07% | 81.91% | 62.11% | 67.01% | 69.27% | 74.71% |
| PROTEINS | 0 | 74.51% | 75.16% | 81.62% | 88.81% | 71.09% | 71.55% |
| PROTEINS | 1 | 58.47% | 56.23% | 63.13% | 65.24% | 55.38% | 60.48% |
2.2.2 Node Labels
To predict the node labels that may be present in the dataset, we have appended a two-layer MLP with ReLU nonlinearity to CCGG, which uses the final node representations as its input. This component was included in the model for utilization in the classification step.
We have employed GraphSAGE [7], a promising framework for inductive representation learning on large graphs, to train a classifier before training the main model, attaching a fully connected layer with a sigmoid activation function to it. During each step of training the main model, the updated graph is fed to the classifier to perform the classification task and calculate the corresponding loss.
2.2.4 Loss Function
The loss function of our model consists of three components. The first is the loss of the GRAN model. As mentioned in Subsection 2.1, this is the negative log-likelihood of the generated graphs, estimated by considering a family of canonical node orderings. The second is the classification loss, calculated as the cross-entropy loss of the generated samples multiplied by a power of the discount factor at each step; this weighting involves the number of nodes of the original graph and of the input subgraph. The idea behind using the discount factor is that, in the generation process, the closer the size of the subgraph is to that of the original graph, the easier it is to add a new node. More precisely, if the input subgraph is missing only one node, it is relatively easy to reconstruct the original graph by generating the single remaining node while preserving the desired graph properties. On the other hand, given a subgraph with only a small part of the nodes, it would be far more difficult for the model to generate a new node (or block of nodes) satisfying the desired conditions.
Third, we include the node label prediction loss, which is the cross-entropy loss of the generated node labels, calculated in the same way as the classification loss.
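The discounting idea can be sketched as follows. The exact exponent is not recoverable from the text, so this example assumes it is the number of nodes still missing from the subgraph; the function names, the classifier output `probs`, and the default discount factor of 0.8 are likewise illustrative assumptions.

```python
import math

def cross_entropy(probs, target):
    """Negative log-probability assigned to the target class."""
    return -math.log(probs[target])

def discounted_loss(probs, target, n_original, n_subgraph, gamma=0.8):
    """Cross-entropy scaled by gamma ** (n_original - n_subgraph) (assumed exponent)."""
    return gamma ** (n_original - n_subgraph) * cross_entropy(probs, target)

probs = [0.3, 0.7]  # hypothetical classifier output for a partial graph

# Nearly complete subgraph: only one node missing, so the loss is barely discounted.
late = discounted_loss(probs, target=1, n_original=10, n_subgraph=9)
# Small subgraph: eight nodes missing, so the same error is weighted far less.
early = discounted_loss(probs, target=1, n_original=10, n_subgraph=2)
```

Under this assumption, misclassifying an almost complete graph costs more than misclassifying an early partial graph, which matches the intuition that late steps are easier to get right.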
Summing up these losses, the total loss for the CCGG model is a weighted sum of the three components. The weights are the model's hyperparameters, which balance each loss in the training phase and can be set separately for each dataset.
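One plausible way to write these losses is given below. The notation is assumed rather than recovered from the original: $\gamma$ for the discount factor, $N_G$ and $n$ for the node counts of the original graph and the input subgraph, $\mathcal{L}_{NL}$ for the node label loss, and $\lambda_1, \lambda_2, \lambda_3$ for the weights. The form of the exponent and the placement of a weight on every term are also assumptions.

```latex
\mathcal{L}_{C} = \gamma^{\,N_{G} - n}\,\mathrm{CE}(\hat{y}, y),
\qquad
\mathcal{L}_{\text{total}} = \lambda_{1}\,\mathcal{L}_{\text{GRAN}}
  + \lambda_{2}\,\mathcal{L}_{C}
  + \lambda_{3}\,\mathcal{L}_{NL}
```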
2.2.5 Gradient Flow Problem
The discrete nature of graphs causes a gradient flow problem when backpropagating the classifier's loss. Specifically, to feed adjacency matrices to the classifier, we need to sample from the output Bernoulli distributions at each step, which makes backpropagation through the sampling step infeasible. We have solved this problem by using stochastic binary neurons, a technique proposed in [14]. Following this method, we sample stochastic binary neurons to generate new edges in the training phase, while in the evaluation phase we sample from the mixture of Bernoulli distributions.
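A minimal sketch of a stochastic binary neuron is shown below. The text does not say which gradient estimator is paired with it, so the straight-through rule used here is an assumption, and both function names are hypothetical.

```python
import random

def sample_edge(p, rng):
    """Forward pass: draw a hard 0/1 edge indicator with P(edge = 1) = p."""
    return 1.0 if rng.random() < p else 0.0

def straight_through_grad(grad_output):
    """Backward pass (assumed): treat d(sample)/dp as 1 and pass the gradient unchanged."""
    return grad_output

rng = random.Random(0)  # seeded so the example is reproducible
samples = [sample_edge(0.3, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
```

The forward pass yields discrete edges that a classifier can consume, while the backward rule supplies a surrogate gradient with respect to the edge probability, which is what allows the classifier loss to influence the generator.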
In this section, we validate the efficacy of our proposed model in generating class-conditioned graphs by performing experiments on two different real-world datasets. We have also tested CCGG's outputs against two other graph classifiers, namely DGCNN [15] and DiffPool [21], to confirm its validity. Note that these classifiers were not used during the training phase.
PROTEINS: This dataset contains 1113 graphs with between 100 and 620 nodes. Each graph represents a protein structure, with amino acids as its nodes, and an edge connects two nodes if they are less than 6 Angstroms apart. The graphs are categorized into two classes: enzymes and non-enzymes.
NCI1: NCI1 is a cheminformatics dataset including 4110 graphs representing chemical compounds classified as positive or negative to cell lung cancer. The atoms of each compound form the graph's nodes, and their bonds form the edges. The number of nodes per graph ranges from 8 to 111.
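The distance rule used for the PROTEINS graphs can be sketched as below, assuming each node is reduced to a single 3D coordinate. The function name `contact_graph` and the sample coordinates are hypothetical, and this is not the procedure used to build the published dataset.

```python
import math
from itertools import combinations

def contact_graph(coords, cutoff=6.0):
    """Return edges (i, j), i < j, between points closer than `cutoff` Angstroms."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(coords), 2)
            if math.dist(a, b) < cutoff]

# Three made-up positions on a line, in Angstroms.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
edges = contact_graph(coords)
```

Only the first two points are within 6 Angstroms of each other, so the resulting graph has a single edge.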
3.2 Compared method
The problem of class-conditional graph generation is relatively unexplored. Therefore, few baseline models are available for comparison. We test the performance of CCGG against CONDGEN [20], a previously proposed method based on Generative Adversarial Nets (GANs) with state-of-the-art results in conditional graph generation. However, since CONDGEN does not directly address the class-conditional problem, we have set its condition to the class label to make the two models comparable. We have used the same experimental setup during training and testing to ensure a fair comparison with the baseline.
To evaluate the generated graphs, we have used several statistics-based metrics describing different graph features: LCC (Largest Connected Component), TC (Triangle Count), Mean D (mean degree of nodes), GINI (Gini index of the degree distribution), and CPL (Characteristic Path Length). According to the results of these metrics, shown in Table 2, CCGG outperforms the baseline. Furthermore, we have reported the classification accuracy for each class of the datasets, as well as the Area Under the Curve (AUC), in Tables 2 and 3, respectively. Our model has achieved better results than CONDGEN on both datasets, with very few exceptions. Utilizing the class vectors in the model's input and using the node label predictor facilitates the class-conditional generation task. The quality of the generated graphs is also preserved and, in most cases, improved (as confirmed by Table 2). This is because the model is trained to minimize CCGG's loss function in Eq. (10), which maintains the balance between the tasks.
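For reference, the five statistics named above can be computed on a small graph as sketched here. These are common textbook definitions and may differ in detail from those used for the reported tables (for example, whether CPL is restricted to the largest component). All function names and the toy graph are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def lcc_size(adj):
    """Number of nodes in the largest connected component."""
    seen, best = set(), 0
    for s in adj:
        if s not in seen:
            comp = bfs_distances(adj, s)
            seen.update(comp)
            best = max(best, len(comp))
    return best

def triangle_count(adj):
    """Count each triangle once by enforcing u < v < w."""
    return sum(1 for u in adj for v in adj[u] if u < v
               for w in adj[v] if v < w and w in adj[u])

def mean_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def gini(values):
    """Gini index of a list of non-negative values."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    return sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs)) / (n * total)

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    total = pairs = 0
    for s in adj:
        dist = bfs_distances(adj, s)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
stats = {
    "LCC": lcc_size(graph),
    "TC": triangle_count(graph),
    "MeanD": mean_degree(graph),
    "GINI": gini([len(nbrs) for nbrs in graph.values()]),
    "CPL": characteristic_path_length(graph),
}
```

Comparing such statistics between generated and reference graphs gives a coarse check that structural properties are preserved.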
For our classifier, we have used three GraphSAGE convolutional layers with 32 output channels, followed by a two-layer MLP with ReLU nonlinearity. The size of the classifier's final output equals the number of classes in each dataset. For the main CCGG model, the node representation dimension is set to 512 for NCI1 and 384 for PROTEINS. In each case, a subvector of size 256 is used to represent the class data. As recommended by the GRAN paper and confirmed by our results, we have set the number of Bernoulli components to 20 and used 7 GNN layers. For our node label predictor, we used a three-layer MLP with ReLU nonlinearity. Its input is the node representation, and the hidden dimension of its middle layers is set to 256. Its output size equals the number of node labels in each dataset, which is 3 and 38 for the PROTEINS and NCI1 datasets, respectively. We have also set the discount factor to 0.8 in the calculation of our loss function. Finally, we used the Adam optimizer for training the different parts of our model.
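The settings listed above can be gathered into a single configuration object, as in this sketch. The class name, field names, and the `structure_dim` property are hypothetical. The sketch also assumes that the value 0.8 refers to the discount factor and that the class subvector is part of the node representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CCGGConfig:
    node_repr_dim: int                 # 512 for NCI1, 384 for PROTEINS
    num_node_labels: int               # output size of the node label predictor
    class_subvector_dim: int = 256     # part of the representation carrying class data
    num_mixture_components: int = 20   # Bernoulli mixture components
    num_gnn_layers: int = 7
    label_mlp_layers: int = 3
    label_mlp_hidden_dim: int = 256
    classifier_sage_layers: int = 3
    classifier_channels: int = 32
    discount_factor: float = 0.8       # assumed to be the loss discount factor
    optimizer: str = "adam"

    @property
    def structure_dim(self):
        """Dimensions left for the adjacency-derived part (assumed split)."""
        return self.node_repr_dim - self.class_subvector_dim

CONFIGS = {
    "NCI1": CCGGConfig(node_repr_dim=512, num_node_labels=38),
    "PROTEINS": CCGGConfig(node_repr_dim=384, num_node_labels=3),
}
```

Under the assumed split, the adjacency-derived part of the representation has 256 dimensions for NCI1 and 128 for PROTEINS.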
This paper introduced CCGG, an autoregressive model for solving class-conditioned graph generation problems based on a previously proposed generative model. We used two real-world datasets and achieved state-of-the-art performance on them. Our future direction will focus on extending the model’s scalability in generating larger graphs. We hope this work will inspire further research on new applications of class-conditional graph generation.
[1] NetGAN: generating graphs via random walks. International Conference on Machine Learning, pp. 609–618.
[2] (1960) On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci 5 (1), pp. 17–60.
[3] (2021) Deep graph generators: a survey. IEEE Access 9, pp. 106675–106702.
[4] (2020) Attention-based graph evolution. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 436–447.
[5] (2020) Multi-motifgan (mmgan): motif-targeted graph generation and prediction. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4182–4186.
[6] C-textgen: conditional text generation for harmonious human-machine interaction. CoRR abs/1909.03409.
[7] (2017) Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024–1034.
[8] (2019) Learning multimodal graph-to-graph translation for molecular optimization.
[9] (2018) Multi-objective de novo drug design with conditional graph generative model. Journal of Cheminformatics 10 (1), pp. 33.
[10] (2019) Efficient graph generation with graph recurrent attention networks. In Advances in Neural Information Processing Systems, pp. 4257–4267.
[11] Constrained graph variational autoencoders for molecule design. In Advances in Neural Information Processing Systems, pp. 7795–7804.
[12] (2017) Conditional image generation using feature-matching gan. In 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5.
[13] (2018) Constrained generation of semantically valid graphs via regularizing variational autoencoders. In Advances in Neural Information Processing Systems, pp. 7113–7124.
[14] (2019) Probabilistic computing with binary stochastic neurons. In 2019 IEEE BiCMOS and Compound semiconductor Integrated Circuits and Technology Symposium (BCICTS), pp. 1–6.
[15] DGCNN: a convolutional neural network over large-scale labeled graphs. Neural Networks 108, pp. 533–543.
[16] (2018) Graphvae: towards generation of small graphs using variational autoencoders. In International Conference on Artificial Neural Networks, pp. 412–422.
[17] (2018) SentiGAN: generating sentimental texts via mixture adversarial networks. In IJCAI.
[18] (2019) Topic-guided variational autoencoders for text generation.
[19] Attribute2image: conditional image generation from visual attributes. European Conference on Computer Vision, pp. 776–791.
[20] (2019) Conditional structure generation through graph variational generative adversarial nets. In Advances in Neural Information Processing Systems, pp. 1340–1351.
[21] (2018) Hierarchical graph representation learning with differentiable pooling. CoRR abs/1806.08804.
[22] (2018) GraphRNN: generating realistic graphs with deep auto-regressive models. In International Conference on Machine Learning, pp. 5708–5717.