1 Introduction
Many key applications require generating objects that satisfy hard structural constraints, such as drug molecules, which must be chemically valid, and game levels, which must be playable. Despite their impressive success (Karras et al., 2018; Zhang et al., 2017; Zhu et al., 2017), Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) struggle in these applications. The reason is that data alone are often insufficient to capture the structural constraints (especially if the data are noisy) and convey them to the model.
As a remedy, we derive Constrained Adversarial Networks (CANs), which extend GANs to generate valid structures with high probability. Given a set of arbitrary discrete constraints, CANs achieve this by penalizing the generator for allocating mass to invalid objects during training. The penalty term is implemented using the semantic loss (SL) (Xu et al., 2018), which turns the discrete constraints into a differentiable loss function implemented as an arithmetic circuit (i.e., a polynomial). The SL is probabilistically sound, can be evaluated exactly, and supports end-to-end training. Importantly, the arithmetic circuit, which can be quite large depending on the complexity of the constraints, can be thrown away after training. CANs handle complex constraints, like reachability on graphs, by first embedding configurations in a space in which the constraints can be encoded compactly, and then applying the SL to the embeddings.
Since the constraints are embedded directly into the generator, high-quality structures can be sampled efficiently (in time practically independent of the complexity of the constraints) with a simple forward pass on the generator, as in regular GANs. No costly sampling or optimization steps are needed. We additionally show how to equip CANs with the ability to switch constraints on and off dynamically during inference, at no runtime cost.
Contributions. In summary, we contribute: 1) CANs, an extension of GANs in which the generator is encouraged at training time to generate valid structures while supporting efficient sampling; 2) native support for intractably complex constraints; 3) conditional CANs, an effective solution for dynamically turning the constraints on and off at inference time; 4) a thorough empirical study on real-world data showing that CANs generate structures that are likely valid and coherent with the training data.
2 Related Work
Structured generative tasks have traditionally been tackled using probabilistic graphical models (Koller and Friedman, 2009) and grammars (Talton et al., 2012), which lack support for representation learning and efficient sampling under constraints. Tractable probabilistic circuits (Poon and Domingos, 2011; Kisa et al., 2014) are a recent alternative that makes use of ideas from knowledge compilation (Darwiche and Marquis, 2002) to provide efficient generation of valid structures. These approaches generate valid objects by constructing a circuit (a polynomial) that encodes both the hard constraints and the probabilistic structure of the problem. Although inference is linear in the size of the circuit, the latter can grow very large if the constraints are complex enough. In contrast, CANs model the probabilistic structure of the problem using a neural architecture, while relying on knowledge compilation for encoding the hard constraints during training. For this reason, the resulting circuit is much more compact. Moreover, the circuit can be discarded at inference time. The time and space complexity of sampling for CANs is therefore roughly independent of the complexity of the constraints in practice.
Deep generative models developed for structured tasks are special-purpose, in that they rely on ad-hoc architectures, tackle specific applications, or do not support efficient sampling (Guimaraes et al., 2017; De Cao and Kipf, 2018; Xue and van Hoeve, 2019; Torrado et al., 2019). Some recent approaches have focused on incorporating a constraint learning component in training deep generative models, using reinforcement learning (De Cao and Kipf, 2018) or inverse reinforcement learning (Hu et al., 2018) techniques. This direction is complementary to ours and is useful when constraints are not known in advance or cannot be easily formalized as functions of the generator output. Indeed, our experiment on molecule generation shows the advantages of enriching CANs with constraint learning to generate high-quality and diverse molecules.

Other general approaches for injecting knowledge into neural nets (like deep statistical-relational models (Lippi and Frasconi, 2009; Manhaeve et al., 2018; Marra and Kuželka, 2019), tensor-based models (Rocktäschel and Riedel, 2017; Donadello et al., 2017), and fuzzy logic-based models (Marra et al., 2019)) are either not generative or require the constraints to be available at inference time.

3 Unconstrained GANs
GANs (Goodfellow et al., 2014) are composed of two neural nets: a discriminator $d$ trained to recognize “real” objects $x$ sampled from the data distribution $P_r$, and a generator $g$ that maps random latent vectors $z$ to objects $g(z)$ that fool the discriminator. Learning equates to solving the minimax game $\min_g \max_d f_{\mathrm{GAN}}(g, d)$ with value function:

$$f_{\mathrm{GAN}}(g, d) := \mathbb{E}_{x \sim P_r}[\log P_d(x)] + \mathbb{E}_{x \sim P_g}[\log(1 - P_d(x))] \tag{1}$$

Here $P_g(x)$ and $P_d(x)$ are the distributions induced by the generator and discriminator, respectively. New objects can be sampled by mapping random vectors $z$ using the generator, i.e., $x = g(z)$. Under idealized assumptions, the learned generator matches the data distribution:
Theorem 1 (Goodfellow et al. (2014)).
In practice, training GANs is notoriously hard (Salimans et al., 2016; Mescheder et al., 2018). The most common failure mode is mode collapse, in which the generated objects are clustered in a tiny region of the object space. Remedies include using alternative objective functions (Goodfellow et al., 2014), divergences (Nowozin et al., 2016; Arjovsky et al., 2017) and regularizers (Miyato et al., 2018). In our experiments, we apply some of these techniques to stabilize training.
In structured tasks, the objects of interest are usually discrete. In the following, we focus on stochastic generators that output a categorical distribution over the object space $\mathcal{X}$, from which objects are then sampled. In this case, $P_g(x) = \mathbb{E}_{z}[P_g(x \mid z)]$.
4 Generating Structures with CANs
Our goal is to learn a deep generative model that outputs structures $x$ consistent with validity constraints and an unobserved distribution $P_r$. We assume to be given: i) a feature map $\phi$ that extracts binary features from $x$, and ii) a single validity constraint $\psi$ encoded as a Boolean formula on $\phi(x)$. Any discrete structured space can be encoded this way.
4.1 Limitations of GANs
Standard GANs are likely to output invalid structures, for two main reasons. First, the VC dimension of unrestricted discrete formulas is exponential in the number of variables (Vapnik and Chervonenkis, 2015). Hence, the number of examples necessary to capture any nontrivial constraint can be intractably large. This rules out learning the rules of chemical validity or, worse still, node reachability from even moderately large data sets. Second, in many cases of interest the examples are noisy and do violate $\psi$. In this more challenging case, it can be shown that the data lures GANs into learning not to satisfy the constraint:
Corollary 1.
Under the assumptions of Theorem 1, given a target distribution $P_r$, a constraint $\psi$ consistent with it, and a dataset of examples sampled i.i.d. from a corrupted distribution $P'_r$ inconsistent with $\psi$, GANs associate nonzero mass to infeasible objects.
4.2 Constrained Adversarial Networks
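Constrained Adversarial Networks (CANs) avoid these issues by taking both the data and the target structural constraint $\psi$ as inputs. The value function is designed so that the generator maximizes the probability of generating valid structures. In order to derive CANs it is convenient to start from the following alternative GAN value function (Goodfellow et al., 2014): $f_{\mathrm{ALT}}(g, d) := \mathbb{E}_{x \sim P_r}[\log P_d(x)] - \mathbb{E}_{x \sim P_g}[\log P_d(x)]$.

Let $(g, d)$ be a GAN and $v(x) := \mathbb{1}\{\phi(x) \models \psi\}$ be a fixed discriminator that distinguishes between valid and invalid structures, where $\models$ indicates logical entailment. Ideally, we wish the generator to never output invalid structures. This can be achieved by using an aggregate discriminator $a$ that only accepts configurations that are both valid and high-quality w.r.t. $d$. Let $A$ be the indicator that $a$ classifies $x$ as real, and similarly for $D$ and $V$. By definition:

$$P_a(x) = P(A \mid x) = P(D \mid x)\, P(V \mid x) = P_d(x)\, \mathbb{1}\{\phi(x) \models \psi\} \tag{2}$$

Plugging the aggregate discriminator into the alternative value function gives:

$$\arg\min_g \max_d f_{\mathrm{ALT}}(g, a) = \arg\min_g \max_d \; \mathbb{E}_{x \sim P_r}[\log P_a(x)] - \mathbb{E}_{x \sim P_g}[\log P_a(x)] \tag{3}$$

$$= \arg\min_g \max_d \; \mathbb{E}_{x \sim P_r}[\log P_d(x) + \log \mathbb{1}\{\phi(x) \models \psi\}] - \mathbb{E}_{x \sim P_g}[\log P_d(x) + \log \mathbb{1}\{\phi(x) \models \psi\}] \tag{4}$$

$$= \arg\min_g \max_d \; \mathbb{E}_{x \sim P_r}[\log P_d(x)] - \mathbb{E}_{x \sim P_g}[\log P_d(x)] - \mathbb{E}_{x \sim P_g}[\log \mathbb{1}\{\phi(x) \models \psi\}] \tag{5}$$

$$= \arg\min_g \max_d \; f_{\mathrm{ALT}}(g, d) - \mathbb{E}_{x \sim P_g}[\log \mathbb{1}\{\phi(x) \models \psi\}] \tag{6}$$

The second step holds because $\mathbb{E}_{x \sim P_r}[\log \mathbb{1}\{\phi(x) \models \psi\}]$ does not depend on $g$. If $P_g$ allocates nonzero mass to any measurable subset of invalid structures, the second term becomes $+\infty$. This is consistent with our goal but problematic for learning. A better alternative is to optimize the lower bound:

$$\mathrm{SL}_\psi(g) := -\log P_g(\psi) = -\log \mathbb{E}_{x \sim P_g}[\mathbb{1}\{\phi(x) \models \psi\}] \le \mathbb{E}_{x \sim P_g}[-\log \mathbb{1}\{\phi(x) \models \psi\}] \tag{7}$$

This term is the semantic loss (SL) proposed in (Xu et al., 2018) to inject knowledge into neural networks. The SL is much smoother than the original and it only evaluates to $+\infty$ if $P_g$ allocates all the mass to infeasible configurations. This immediately leads to the CAN value function:

$$f_{\mathrm{CAN}}(g, d) := f_{\mathrm{ALT}}(g, d) + \lambda\, \mathrm{SL}_\psi(g) \tag{8}$$

where $\lambda > 0$ is a newly introduced hyperparameter controlling the importance of the constraint. The SL can be viewed as the negative log-likelihood of the constraint $\psi$. This shows that it rewards the generator proportionally to the mass it allocates to valid structures. The SL can be rewritten as:

$$\mathrm{SL}_\psi(g) = -\log \sum_{x \,:\, \phi(x) \models \psi} P_g(x) \tag{9}$$

Since the SL is the negative logarithm of a polynomial in the probabilities output by the generator, it is fully differentiable (so long as $P_g(\psi) > 0$, which is always the case in practice). In practice, below we apply the semantic loss term directly to $f_{\mathrm{GAN}}$, i.e., $f_{\mathrm{CAN}}(g, d) := f_{\mathrm{GAN}}(g, d) + \lambda\, \mathrm{SL}_\psi(g)$.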
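To make the definition of the SL concrete, the following minimal sketch evaluates it by brute-force enumeration. It assumes, for illustration only, that the generator outputs independent Bernoulli probabilities for a handful of binary variables; the names `semantic_loss` and `exactly_one` are hypothetical and do not come from the paper's code.

```python
import itertools
import math


def semantic_loss(probs, constraint):
    """Exact SL for independent Bernoulli outputs, by brute-force enumeration.

    probs[i] is the probability that binary variable i equals 1 (assumption:
    variables are independent); constraint is a predicate over a 0/1 tuple.
    """
    valid_mass = 0.0
    for assignment in itertools.product((0, 1), repeat=len(probs)):
        if constraint(assignment):
            mass = 1.0
            for p, value in zip(probs, assignment):
                mass *= p if value else 1.0 - p
            valid_mass += mass
    # Negative log of the total mass placed on valid assignments.
    return -math.log(valid_mass)


def exactly_one(assignment):
    """Toy constraint: exactly one variable is switched on."""
    return sum(assignment) == 1


mostly_valid = semantic_loss([0.9, 0.05, 0.05], exactly_one)
mostly_invalid = semantic_loss([0.9, 0.9, 0.9], exactly_one)
```

The loss is small when most of the mass already sits on valid assignments and grows as mass moves to invalid ones. The enumeration is exponential in the number of variables, so it is only usable on toy problems.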
If the SL is given a large enough weight, it gets closer to the ideal “hard” discriminator and therefore more strongly encourages the CAN to generate valid structures. Under the preconditions of Theorem 1, it can be shown that for $\lambda \to \infty$ CANs generate valid structures only:
Proposition 1.
Under the assumptions of Corollary 1, CANs associate zero mass to infeasible objects, irrespective of the discrepancy between $P_r$ and $P'_r$.
This holds because any global equilibrium $(g^*, d^*)$ of $\min_g \max_d f_{\mathrm{CAN}}(g, d)$ must minimize the second term. If $g$ is nonparametric, then the minimum is attained by $\mathrm{SL}_\psi(g^*) = 0$ or equivalently $P_{g^*}(\psi) = 1$, which implies $P_{g^*}(\neg\psi) = 0$, proving the claim. Of course, as with standard GANs, the prerequisites are often violated in practice. Regardless, Proposition 1 works as a sanity check, and shows that, in contrast to GANs, CANs are appropriate for structured generative tasks.
A possible alternative for introducing a differentiable knowledge-based loss into the value function is to relax the constraints using fuzzy logic, as done in a number of recent works on deep discriminative learning (Donadello et al., 2017; Marra et al., 2019). Apart from lacking a formal derivation in terms of the expected probability of satisfying the constraints, fuzzy logic is not semantically sound, meaning that equivalent encodings of the same constraint may give different loss functions (Giannini et al., 2018). Figure 1 illustrates this inconsistency using an XOR constraint: the “fuzzy loss” and its gradient change radically depending on whether the XOR is encoded as CNF (left) or DNF (middle), while the SL is unaffected (right).
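The encoding dependence of fuzzy relaxations can be checked numerically. The sketch below assumes the product t-norm (one common choice; the relaxation used for Figure 1 is not specified here) and compares the fuzzy truth value of XOR under its CNF and DNF encodings with the exact satisfaction probability used by the SL. All function names are illustrative.

```python
import itertools


def t_and(a, b):
    return a * b  # product t-norm


def t_or(a, b):
    return a + b - a * b  # probabilistic sum (dual t-conorm)


def t_not(a):
    return 1.0 - a


def fuzzy_xor_cnf(a, b):
    # (a OR b) AND (NOT a OR NOT b)
    return t_and(t_or(a, b), t_or(t_not(a), t_not(b)))


def fuzzy_xor_dnf(a, b):
    # (a AND NOT b) OR (NOT a AND b)
    return t_or(t_and(a, t_not(b)), t_and(t_not(a), b))


def bool_xor_cnf(x, y):
    return (x or y) and (not x or not y)


def bool_xor_dnf(x, y):
    return (x and not y) or (not x and y)


def satisfaction_probability(formula, a, b):
    """Probability that a Boolean formula holds for independent Bernoulli inputs."""
    total = 0.0
    for x, y in itertools.product((0, 1), repeat=2):
        if formula(x, y):
            total += (a if x else 1.0 - a) * (b if y else 1.0 - b)
    return total


cnf_value = fuzzy_xor_cnf(0.5, 0.5)
dnf_value = fuzzy_xor_dnf(0.5, 0.5)
sl_cnf = satisfaction_probability(bool_xor_cnf, 0.5, 0.5)
sl_dnf = satisfaction_probability(bool_xor_dnf, 0.5, 0.5)
```

Under this relaxation the two fuzzy values disagree even though the formulas are logically equivalent, whereas the satisfaction probability depends only on the set of models and is therefore identical for both encodings.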
Evaluating the Semantic Loss
The sum in Eq. 9 represents the unnormalized probability of sampling a valid configuration from the generator distribution $P_g$. Evaluating it involves computing the Weighted Model Count (WMC) (Chavira and Darwiche, 2008), i.e. the sum of all solutions of the constraint $\psi$, weighted according to their probability with respect to the generator's output. Naïvely implementing the SL as in Eq. 9 is infeasible in most cases, as it involves summing over exponentially many configurations. Knowledge compilation (KC) (Darwiche and Marquis, 2002) is a well-known approach in automated reasoning, and solving WMC through KC is a state-of-the-art technique for answering probabilistic queries in many discrete graphical models (Chavira and Darwiche, 2008; Fierens et al., 2015; Van den Broeck et al., 2011). These techniques work by compiling the problem into a more compact representation and are particularly effective when the logical knowledge does not change over time. As pointed out in (Xu et al., 2018), KC comes to the rescue by making the SL much more efficient to evaluate during training, at the cost of an offline compilation phase.

The main downside of KC is that, depending on the complexity of $\psi$, the compiled circuit may be very large. This is less of an issue during training, which is often performed on powerful machines, but it can be problematic for inference, especially on embedded devices. A major advantage of CANs is that the circuit is not required for inference (as the latter consists of a simple forward pass over the generator), and can thus be thrown away after training. This means that CANs incur no space penalty during inference compared to GANs.
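The gain from a compact representation can be illustrated on a toy constraint. The sketch below computes the weighted model count of an "exactly one variable is on" constraint in two ways: by enumerating all assignments, and by a hand-written linear-time recurrence that plays the role of a compiled circuit. The recurrence is derived by hand for this specific constraint and is not the output of PySDD or any other compiler; independence of the variables is assumed.

```python
import itertools


def wmc_enumerate(probs):
    """Weighted model count of 'exactly one on' by summing over all 2^n assignments."""
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(probs)):
        if sum(bits) == 1:
            weight = 1.0
            for p, bit in zip(probs, bits):
                weight *= p if bit else 1.0 - p
            total += weight
    return total


def wmc_circuit(probs):
    """Same quantity in O(n): track the mass of prefixes with zero or one variable on."""
    none_on, one_on = 1.0, 0.0
    for p in probs:
        # Update one_on first so that it still sees the previous value of none_on.
        one_on = one_on * (1.0 - p) + none_on * p
        none_on = none_on * (1.0 - p)
    return one_on


probs = [0.2, 0.5, 0.7, 0.1]
by_enumeration = wmc_enumerate(probs)
by_circuit = wmc_circuit(probs)
```

Both functions are polynomials in the input probabilities, so either could be differentiated for training; only the second remains practical as the number of variables grows.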
The embedding function
When fed a particularly complex constraint, knowledge compilation may produce a circuit too large even for the training stage. In the case of such an intractable constraint, we approximate the semantic loss by first mapping the objects to an application-specific space where the constraint can be expressed in compact form, and then applying the semantic loss on top of the transformed objects. We successfully employed this technique to synthesize Mario levels where the goal tile is reachable from the starting tile; all details are provided below. The same technique can be exploited for dealing with very complex logical formulas beyond the reach of state-of-the-art knowledge compilation.
4.3 Conditional CANs
So far we described how to use the SL for enforcing structural constraints on the generator’s output. Since the SL can be applied to any distribution over binary variables, it can also be used to enforce conditional constraints that can be turned on and off at inference time. Specifically, we notice that the constraint can also involve latent variables, and we show how this can be leveraged for different purposes. Similarly to InfoGANs (Chen et al., 2016), the generator’s input is augmented with an additional binary vector $c$. Instead of maximizing (an approximation of) the mutual information between $c$ and the generator’s output, the SL is used to logically bind the input codes to semantic features or constraints of interest. Let $\psi_1, \ldots, \psi_k$ be constraints of interest. In order to make them switchable, we extend the latent vector $z$ with $k$ fresh variables $c = (c_1, \ldots, c_k) \in \{0, 1\}^k$ and train the CAN using the constraint:

$$\psi := \bigwedge_{i=1}^{k} (c_i \leftrightarrow \psi_i)$$

where the prior $P(c)$ used during training is estimated from data.
Using a conditional SL term during training results in a model that can be conditioned to generate objects with desired, arbitrarily complex properties at inference time. Additionally, this feature has a beneficial effect in mitigating mode collapse during training, as reported in Section 5.2.
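One way to read the switchable constraint is as a conjunction of biconditionals that tie each code bit to one property of the generated object. The sketch below is an assumption about that form, not the paper's implementation; `has_pipe`, `has_enemy`, and the feature layout are hypothetical.

```python
def conditional_constraint(codes, features, constraints):
    """True iff every code bit agrees with the truth value of its bound constraint."""
    return all(
        bool(code) == bool(psi(features))
        for code, psi in zip(codes, constraints)
    )


def has_pipe(features):
    # Hypothetical property: first binary feature flags the presence of a pipe.
    return features[0] == 1


def has_enemy(features):
    # Hypothetical property: second binary feature flags the presence of an enemy.
    return features[1] == 1


constraints = [has_pipe, has_enemy]

# An object with a pipe and no enemy is consistent with codes (1, 0) only.
consistent = conditional_constraint((1, 0), (1, 0), constraints)
inconsistent = conditional_constraint((0, 0), (1, 0), constraints)
```

Under this reading, setting a code bit at inference time asks the generator for objects that satisfy the bound property, and clearing it asks for objects that do not.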
5 Experiments
Our experimental evaluation aims at answering the following questions:

Q1 Can CANs with tractable constraints achieve better results than GANs?

Q2 Can CANs with intractable constraints achieve better results than GANs?

Q3 Can constraints be combined with rewards to achieve better results than using rewards only?
We implemented CANs using TensorFlow and used PySDD (pypi.org/project/PySDD/) to perform knowledge compilation. We tested CANs using different generator architectures on three real-world structured generative tasks (details and code can be found in the Supplementary material). In all cases, we evaluated the objects generated by CANs and those of the baselines using three metrics (adopted from Samanta et al. (2018)): validity is the proportion of sampled objects that are valid; novelty is the proportion of valid sampled objects that are not present in the training data; and uniqueness is the proportion of valid unique (non-repeated) sampled objects.

5.1 Super Mario Bros level generation
In this experiment we show how CANs can help in the challenging task of learning to generate video game levels from user-authored content. While procedural approaches to video game level generation have successfully been used for decades, the application of machine learning techniques to the creation of (functional) content is a relatively new area of research (Summerville et al., 2018). On the one hand, modern video game levels are characterized by aesthetic features that cannot be formally encoded and thus are difficult to implement in a procedure, which motivates the use of ML techniques for the task. On the other hand, the levels often have to satisfy a set of functional (hard) constraints that are easy to guarantee when the generator is hand-coded but pose challenges for current machine learning models.

Architectures for Super Mario Bros level generation include LSTMs (Summerville and Mateas, 2016), probabilistic graphical models (Guzdial and Riedl, 2016), and multidimensional MCMC (Snodgrass and Ontanón, 2016). MarioGANs (Torrado et al., 2019) are specifically designed for level generation, but they only constrain the mixture of tiles appearing in the level. This technique cannot be easily generalized to arbitrary constraints.
In the following, we show how the semantic loss can be used to encode useful hard constraints in the context of video game level generation. These constraints might be functional requirements that apply to every generated object or might be used contextually to steer the generation towards objects with certain properties. In our empirical analysis, we focus on Super Mario Bros (SMB), possibly one of the most studied video games in tile-based level generation.
Recently, Volz et al. (2018) applied Wasserstein GANs (WGANs) (Arjovsky et al., 2017) to SMB level generation. The approach works by first training a generator in the usual way, then using an evolutionary algorithm called Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to search for the best latent vectors according to a user-defined fitness function on the corresponding levels. We stress that this technique is orthogonal to CANs and the two can be combined. We adopt the same experimental setting, WGAN architecture and training procedure as Volz et al. (2018). The structured objects are tile-based representations of SMB levels (e.g. Fig. 2) and the training data is obtained by sliding a fixed-size window of tiles over levels from the Video Game Level Corpus (Summerville et al., 2016).

We ran all the experiments on a machine with a single 1080Ti GPU.
5.1.1 CANs with tractable constraints: generating SMB levels with pipes
In this experiment, the focus is on showing how CANs can effectively deal with constraints that can be directly encoded over the generator output. Pipes are made of four different types of tiles. They can have a variable height but the general structure is always the same: two tiles (top-left and top-right) on top and one or more pairs of body tiles (body-left and body-right) below (see the "CAN - pipes" panel in Fig. 2 for examples of valid pipes). Since encoding all possible placements and combinations of pipes in a level would result in an extremely large propositional formula, we apply the constraint locally to a window that is slid, horizontally and vertically, by one tile at a time (notice that all structural properties of pipes are covered using this method). The constraint is a conjunction of many implications of the type “if this is a top-left tile, then the tile below must be a body-left one” (see the Supplementary material for the full formula). The relative importance of the constraint is determined by the hyperparameter $\lambda$ (see Eq. 8).
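The flavour of these local implications can be sketched as a plain validity check over a tile grid. The tile codes (`"TL"`, `"TR"`, `"BL"`, `"BR"`) and the three rules below are illustrative assumptions; the full formula used in the experiments is given in the Supplementary material and is not reproduced here.

```python
def pipes_valid(level):
    """Check a few local pipe implications on a level.

    level is a list of equal-length rows (top to bottom) of tile codes.
    Hypothetical codes: TL/TR are the pipe top, BL/BR the pipe body.
    """
    rows, cols = len(level), len(level[0])
    for r in range(rows):
        for c in range(cols):
            tile = level[r][c]
            below = level[r + 1][c] if r + 1 < rows else None
            right = level[r][c + 1] if c + 1 < cols else None
            # A top-left tile needs a body-left below and a top-right beside it.
            if tile == "TL" and (below != "BL" or right != "TR"):
                return False
            # A top-right tile needs a body-right below it.
            if tile == "TR" and below != "BR":
                return False
            # A body-left tile needs a body-right beside it.
            if tile == "BL" and right != "BR":
                return False
    return True


valid_level = [
    ["-", "-", "-", "-"],
    ["-", "TL", "TR", "-"],
    ["-", "BL", "BR", "-"],
]
broken_level = [
    ["-", "-", "-", "-"],
    ["-", "TL", "-", "-"],
    ["-", "BL", "BR", "-"],
]
```

Each rule only inspects a tile and its immediate neighbours, which is why a small sliding window is enough to cover it.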
[Figure 2: examples of generated SMB levels. Panels, left to right: GAN - pipes, CAN - pipes, GAN - playable, CAN - playable.]
There are two major problems in the application of the constraint on pipes when using a large $\lambda$: i) vanishing pipes: this occurs because the generator can satisfy the constraint by simply generating levels without pipes; ii) mode collapse: the generator may learn to always place pipes in the same positions. We address both issues by introducing the SL after an initial bootstrap phase of a fixed number of epochs, in which the generator learns to generate sensible objects, and by linearly increasing its weight from zero to $\lambda$. The final value for $\lambda$ was chosen as the highest value that retains at least 80% of pipe tiles on average with respect to a plain GAN. All experiments were run for a fixed number of epochs.
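A delayed, linearly increasing weight of this kind can be sketched as a small schedule function. The function name, the split into a bootstrap and a ramp phase, and all numeric values used with it are assumptions for illustration; the actual epoch counts and final weight are not reproduced here.

```python
def sl_weight(epoch, bootstrap_epochs, ramp_epochs, max_weight):
    """Weight of the SL term at a given epoch.

    Zero during the bootstrap phase, then a linear ramp up to max_weight
    over ramp_epochs (assumed positive), then constant.
    """
    if epoch < bootstrap_epochs:
        return 0.0
    progress = (epoch - bootstrap_epochs) / ramp_epochs
    return max_weight * min(1.0, progress)


# Hypothetical settings: 5 bootstrap epochs, 10 ramp epochs, final weight 2.0.
schedule = [sl_weight(e, 5, 10, 2.0) for e in range(20)]
```

Keeping the weight at zero early on lets the adversarial loss shape the generator before the constraint starts pulling mass towards trivially valid outputs.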
Table 1 reports experimental results comparing GAN and CAN trained on all levels containing pipes. CANs manage to almost double the validity of the generated levels (see the two left pictures in Fig. 2 for some prototypical examples) while retaining about 82% of the pipe tiles and without any significant loss in terms of diversity (as measured by the L1 norm on the difference between each pair of levels in the generated batch). The cost is a roughly doubled training time. Inference is real-time (< 40 ms) for both architectures.
These results allow us to answer Q1 affirmatively.
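Table 1: GAN and CAN trained on all SMB levels containing pipes.

| Model | # Maps | Validity | Avg pipe tiles / level | L1 Norm | Training time |
|---|---|---|---|---|---|
| GAN | 7 | 47.6% | 7.8 | 0.0115 | 1 h 12 min |
| CAN | 7 | 83.2% | 6.4 | 0.0110 | 2 h 2 min |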
5.1.2 CANs with intractable constraints: generating playable SMB levels
In the following we show how CANs can be successfully applied in settings where constraints are too complex to be directly encoded onto the generator output. A level is playable if there is a feasible path (according to the game’s physics) from the leftmost to the rightmost column of the level. We refer to this property as reachability. We compare CANs with CMA-ES, as both techniques can be used to steer the network towards the generation of playable levels. In CMA-ES, the fitness function does not have to be differentiable and the playability is computed on the output of an A* agent (the same used in (Volz et al., 2018)) playing the level. Using the SL to steer the generation towards playable levels is not trivial, since it requires a differentiable definition of playability. Directly encoding the constraint in propositional logic is intractable: consider the size of a propositional formula describing all possible paths a player can follow in the level. We thus define the playability constraint on the output of an embedding function (modelled as a feedforward NN) that approximates tile reachability. The function is trained to predict whether each tile is reachable from the leftmost column using traces obtained from the A* agent. See the Supplementary material for the details.
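Table 2: Validity of levels generated by plain GAN, GAN + CMA-ES, and CAN.

| Network type | Level | Tested samples | Validity | Training time | Inference time per sample |
|---|---|---|---|---|---|
| GAN | mario13 | 1000 | 9.80% | 1 h 15 min | 40 ms |
| GAN + CMA-ES | mario13 | 1000 | 65.90% | 1 h 15 min | 22 min |
| CAN | mario13 | 1000 | 71.60% | 1 h 34 min | 40 ms |
| GAN | mario33 | 1000 | 13.00% | 1 h 11 min | 40 ms |
| GAN + CMA-ES | mario33 | 1000 | 64.20% | 1 h 11 min | 22 min |
| CAN | mario33 | 1000 | 62.30% | 1 h 27 min | 40 ms |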
Table 2 shows the validity of a batch of levels generated respectively by plain GAN, GAN combined with CMA-ES using the default parameters for the search, and a forward pass of CAN. Each training run lasted a fixed number of epochs with all the default hyperparameters defined in Volz et al. (2018), and the SL was activated partway through training with a weight $\lambda$ that validation experiments showed to be a reasonable trade-off between SL and generator loss. Results show that CANs achieve better (mario13) or comparable (mario33) validity with respect to GAN + CMA-ES at a fraction of the inference time. At the cost of pretraining the reachability function, CANs avoid the execution of the A* agent during generation and sample high-quality objects in milliseconds (as compared to minutes), thus enabling applications to create new levels at run time. Moreover, no significant quality degradation can be seen in the generated levels as compared to the ones generated by plain GAN (which on the other hand fails most of the time to generate reachable levels), as can be seen in Fig. 2.
With these results, we can answer Q2 affirmatively.
5.2 Molecule generation
Most approaches in molecule generation use variational autoencoders (VAEs) (Gómez-Bombarelli et al., 2018; Kusner et al., 2017; Dai et al., 2018; Samanta et al., 2019), or more expensive techniques like MCMC (Seff et al., 2019). Closest to CANs are ORGANs (Guimaraes et al., 2017) and MolGANs (De Cao and Kipf, 2018), which respectively combine Sequence GANs (SeqGANs) and Graph Convolutional Networks (GCNs) with a reward network that optimizes specific chemical properties. Although they compare favorably with both sequence models (Jaques et al., 2017; Guimaraes et al., 2017) (using SMILES representations) and likelihood-based methods, MolGANs are reported to be susceptible to mode collapse.

In this experiment, we investigate Q3 by combining MolGAN’s adversarial training and reinforcement learning objective with a conditional SL term on the task of generating molecules with certain desirable chemical properties.
In contrast with our previous experimental settings, here the structured objects are undirected graphs of bounded maximum size, represented by discrete tensors that encode the atom/node type (padding atom (no atom), Carbon, Nitrogen, Oxygen, Fluorine) and the bond/edge type (padding bond (no bond), single, double, triple and aromatic bond). During training, the network implicitly rewards validity and the maximization of three chemical properties at once: QED (druglikeness), SA (synthesizability) and logP (solubility). The training is stopped once the uniqueness drops under a fixed threshold. We augment the MolGAN architecture with a conditional SL term, making use of latent dimensions, each controlling the presence of one of the types of atoms considered in the experiment, as shown in Section 4.3.

Conditioning the generation of molecules on specific atoms at training time mitigates the drop in uniqueness caused by the reward network during training. This allows the model to be trained for more epochs and results in higher quality molecules, as reported in Table 3. (The experimental setting and evaluation metrics are identical to De Cao and Kipf (2018) except for the introduction of the SL; we thus report the same results for the baseline.)

In this experiment, we train the model on an NVIDIA RTX 2080 Ti. The total training time is around 1 hour, and inference is real-time. Using CANs produced a negligible overhead during training with respect to the original model, providing further evidence that the technique does not heavily impact training.
These results suggest that coupling CANs with a reinforcement learning objective is beneficial, answering Q3 affirmatively.
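Table 3: MolGAN without (SL = False) and with (SL = True) the conditional SL term.

| Reward for | SL | Validity | Uniqueness | Diversity | QED | SA | logP |
|---|---|---|---|---|---|---|---|
| QED + SA + logP | False | 97.4 | 2.4 | 91.0 | 47.0 | 84.0 | 65.0 |
| QED + SA + logP | True | 96.6 | 2.5 | 98.8 | 51.8 | 90.7 | 73.6 |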
6 Conclusion
We presented Constrained Adversarial Networks (CANs), a generalization of GANs in which the generator is encouraged during training to output valid structures. CANs make use of the semantic loss (Xu et al., 2018) to penalize the generator proportionally to the mass it allocates to invalid structures. As in GANs, generating valid structures (on average) requires a simple forward pass on the generator. Importantly, the data structures used by the SL, which can be large if the structural constraints are very complex, are discarded after training. CANs proved effective in improving the quality of the generated structures without significantly affecting inference runtime, and conditional CANs proved useful in promoting diversity of the generator’s outputs.
Broader Impact
Broadly speaking, this work aims at improving the reliability of structures / configurations generated via machine learning approaches. This can have a strong impact on a wide range of research fields and application domains, from drug design and protein engineering to layout synthesis and urban planning. Indeed, the lack of reliability of machine-generated outcomes is one of the main obstacles to a wider adoption of machine learning technology in our societies. On the other hand, there is a risk of overestimating the reliability of the outputs of CANs, which are only guaranteed to satisfy constraints in expectation. For applications in which invalid structures should be avoided, like safety-critical applications, the objects output by CANs should always be validated before use.
From an artificial intelligence perspective, this work supports the line of thought that in order to overcome the current limitations of AI there is a need for combining machine learning and especially deep learning technology with approaches from knowledge representation and automated reasoning, and that principled ways to achieve this integration should be pursued.
References
 Wasserstein generative adversarial networks. In International conference on machine learning, pp. 214–223. Cited by: §3, §5.1, Super Mario Bros Level Generation.
 On probabilistic inference by weighted model counting. Artificial Intelligence 172 (67), pp. 772–799. Cited by: §4.2.
 Infogan: interpretable representation learning by information maximizing generative adversarial nets. In Advances in neural information processing systems, pp. 2172–2180. Cited by: §4.3.
 Syntaxdirected variational autoencoder for molecule generation. In Proceedings of the International Conference on Learning Representations, Cited by: §5.2.
 A knowledge compilation map. Journal of Artificial Intelligence Research 17, pp. 229–264. Cited by: §2, §4.2.
 MolGAN: An implicit generative model for small molecular graphs. arXiv preprint arXiv:1805.11973. Cited by: §2, §5.2, Molecule Generation, footnote 5.
 Logic tensor networks for semantic image interpretation. Cited by: §2, §4.2.

Inference and learning in probabilistic logic programs using weighted boolean formulas
. Theory and Practice of Logic Programming 15 (3), pp. 358–401. Cited by: §4.2.  On a convex logic fragment for learning and reasoning. IEEE Transactions on Fuzzy Systems 27 (7), pp. 1407–1416. Cited by: §4.2.
 Automatic chemical design using a datadriven continuous representation of molecules. ACS central science 4 (2), pp. 268–276. Cited by: §5.2.
 Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680. Cited by: §1, §3, §3, §4.2, Theorem 1.
 Objectivereinforced generative adversarial networks (ORGAN) for sequence generation models. arXiv preprint arXiv:1705.10843. Cited by: §2, §5.2.
 Game level generation from gameplay videos. In Twelfth Artificial Intelligence and Interactive Digital Entertainment Conference, Cited by: §5.1.
 Deep generative models with learnable knowledge constraints. In Advances in Neural Information Processing Systems, pp. 10501–10512. Cited by: §2.
 Sequence tutor: conservative finetuning of sequence generation models with klcontrol. In Proceedings of the 34th International Conference on Machine LearningVolume 70, pp. 1645–1654. Cited by: §5.2.
 Progressive growing of gans for improved quality, stability, and variation. Cited by: §1.
 Probabilistic sentential decision diagrams. In Fourteenth International Conference on the Principles of Knowledge Representation and Reasoning, Cited by: §2.
 Probabilistic graphical models: principles and techniques. MIT press. Cited by: §2.
 Grammar variational autoencoder. In Proceedings of the 34th International Conference on Machine LearningVolume 70, pp. 1945–1954. Cited by: §5.2.
 Prediction of protein residue contacts by markov logic networks with groundingspecific weights. Bioinformatics 25 (18), pp. 2326–2333. Cited by: §2.
 DeepProbLog: Neural probabilistic logic programming. In Advances in Neural Information Processing Systems, pp. 3749–3759. Cited by: §2.
 LYRICS: a General Interface Layer to Integrate AI and Deep Learning. arXiv preprint arXiv:1903.07534. Cited by: §2, §4.2.
 Neural markov logic networks. arXiv preprint arXiv:1905.13462. Cited by: §2.
 Which training methods for gans do actually converge?. arXiv preprint arXiv:1801.04406. Cited by: §3.
 Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957. Cited by: §3.
 fGAN: Training generative neural samplers using variational divergence minimization. In Advances in neural information processing systems, pp. 271–279. Cited by: §3.

 Sum-product networks: a new deep architecture. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 689–690. Cited by: §2.
 Quantum chemistry structures and properties of 134 kilo molecules. Scientific Data 1. Cited by: Molecule Generation.
 End-to-end differentiable proving. In Advances in Neural Information Processing Systems, pp. 3788–3800. Cited by: §2.
 Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. Journal of Chemical Information and Modeling 52 (11), pp. 2864–2875. PMID: 23088335. https://doi.org/10.1021/ci300415d Cited by: Molecule Generation.
 Improved techniques for training GANs. In Advances in neural information processing systems, pp. 2234–2242. Cited by: §3.
 NeVAE: a deep generative model for molecular graphs. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 1110–1117. Cited by: §5.2.
 Designing random graph models using variational autoencoders with applications to chemical design. arXiv preprint arXiv:1802.05283. Cited by: §5.
 Discrete object generation with reversible inductive construction. arXiv preprint arXiv:1907.08268. Cited by: §5.2.

 Controllable procedural content generation via constrained multi-dimensional Markov chain sampling. In IJCAI, pp. 780–786. Cited by: §5.1.
 The VGLC: the video game level corpus. arXiv preprint arXiv:1606.07487. Cited by: §5.1.
 Super Mario as a string: platformer level generation via LSTMs. arXiv preprint arXiv:1603.00930. Cited by: §5.1.
 Procedural content generation via machine learning (PCGML). IEEE Transactions on Games 10 (3), pp. 257–270. Cited by: §5.1.
 Learning design patterns with Bayesian grammar induction. In Proceedings of the 25th annual ACM symposium on User interface software and technology, pp. 63–74. Cited by: §2.
 Bootstrapping conditional GANs for video game level generation. arXiv preprint arXiv:1910.01603. Cited by: §2, §5.1.
 Lifted probabilistic inference by first-order knowledge compilation. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp. 2178–2185. Cited by: §4.2.
 On the uniform convergence of relative frequencies of events to their probabilities. In Measures of complexity, pp. 11–30. Cited by: §4.1.

 Evolving Mario levels in the latent space of a deep convolutional generative adversarial network. In Proceedings of the Genetic and Evolutionary Computation Conference, pp. 221–228. Cited by: §5.1.2, §5.1.2, §5.1, Super Mario Bros Level Generation.
 A semantic loss function for deep learning with symbolic knowledge. In International Conference on Machine Learning, pp. 5498–5507. Cited by: §1, §4.2, §4.2, §6.
 Embedding decision diagrams into generative adversarial networks. In International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research, pp. 616–632. Cited by: §2.
 StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 5907–5915. Cited by: §1.

 Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232. Cited by: §1.
Supplementary Material: implementation details
Super Mario Bros Level Generation
The deep neural network for this experiment is based on the DCGANs used in [Volz et al., 2018]. Batch normalization and ReLU are applied between the layers of the generator, while batch normalization and Leaky ReLU (with a fixed slope) are used for the discriminator. In the last layer of the generator we apply a softmax activation function to obtain probabilities, which are then given as input to the semantic loss. Samples, on the other hand, are generated by applying, again on the generator output, a stretched softmax function followed by an argmax, as in [Volz et al., 2018].
The networks were trained following the WGAN guidelines [Arjovsky et al., 2017]. Accordingly, a fixed number of discriminator iterations is performed for each generator iteration. RMSProp is used as the optimizer, with a constant learning rate and a fixed batch size. The layers of both the generator and the discriminator are initialized with a normal initializer. Moreover, weight clipping is applied to the weights of the discriminator. Finally, the latent vector has size 32 (the generator input in Table 4) and is sampled from a normal distribution.
Table 4 shows the network architecture of the generator and the discriminator.
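| Part | Input Shape | Output Shape | Layer Type | Kernel | Stride |
|---|---|---|---|---|---|
| Generator | (32) | (1, 1, 32) | Reshape | | |
| Generator | (1, 1, 32) | (, , 16) | Deconv. | | |
| Generator | (, , 16) | (, , 8) | Deconv. | | |
| Generator | (, , 8) | (, , 4) | Deconv. | | |
| Generator | (, , 4) | (, , 13) | Deconv. | | |
| Discriminator | (, , 13) | (, , 64) | Conv. | | |
| Discriminator | (, , 64) | (, , 128) | Conv. | | |
| Discriminator | (, , 128) | (, , 256) | Conv. | | |
| Discriminator | (, , 256) | (1, 1, 1) | Conv. | | |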
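The two uses of the generator's last layer described in this section can be sketched in plain Python as follows. This is only an illustration: the function names and the `stretch` factor are hypothetical, and we assume that "stretched" means multiplying the logits by a constant before the softmax, which may differ from the exact formulation of [Volz et al., 2018].

```python
import math


def softmax(logits, stretch=1.0):
    """Numerically stable softmax; `stretch` > 1 sharpens the distribution.

    The `stretch` parameter is an assumption made for this sketch.
    """
    scaled = [stretch * x for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tile(logits, stretch=10.0):
    """Pick the tile class for one pixel: stretched softmax, then argmax."""
    probs = softmax(logits, stretch)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits for one pixel with three tile classes.
logits = [0.2, 1.5, -0.3]
train_probs = softmax(logits)   # probabilities passed to the semantic loss
tile = sample_tile(logits)      # discrete tile emitted at generation time
```

Since the argmax is invariant to a positive rescaling of the logits, the stretch factor in this sketch changes only how peaked the intermediate distribution is, not which tile is selected.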
Details about the pipes constraint
The experiment with the pipes constraint was run for the number of epochs reported in the main paper. Figure 3 shows the various parts composing a pipe and their arrangement. Consider the tensor of Boolean variables corresponding to the output of the generator. Recall that we apply the constraint separately to fixed-size windows, with each pixel having one channel per tile class. The five channels represent the probabilities of the tiles: [top-left, top-right, body-left, body-right, others]. In particular, in the last channel we collapse all the probabilities of the tiles that do not belong to pipes (air, monsters, walls, …). Then, given the Boolean vector, the clauses composing the final constraint can be written as:
top-left tile requires top-right tile on the right and vice versa
body-left tile requires body-right tile on the right and vice versa
top-left tile requires body-left tile below
top-right tile requires body-right tile below
body-left tile requires body-left or top-left above
body-right tile requires body-right or top-right above
One-hot encoding over all the 4 positions
Note that the first two indices describe the position within the window (for example, the upper-left corner), and the third index identifies the tile type.
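As a minimal sketch of how the clauses above turn into a loss, the following Python snippet computes the semantic loss of a single window by brute-force enumeration, following the definition of Xu et al. (negative log of the probability mass assigned to satisfying assignments, with each Boolean variable treated as an independent Bernoulli). It assumes a 2×2 window (suggested by the one-hot clause over 4 positions) and skips clauses whose neighbor falls outside the window. The names `window_is_valid` and `semantic_loss` are illustrative; the actual implementation compiles the constraint into an arithmetic circuit rather than enumerating assignments.

```python
import itertools
import math

TL, TR, BL, BR, OTHER = range(5)  # tile classes, in the order listed above
ROWS, COLS = 2, 2                 # assumed window size


def window_is_valid(w):
    """Check the pipe clauses on a window; w[r][c] is a tile class index."""
    for r in range(ROWS):
        for c in range(COLS):
            t = w[r][c]
            if c + 1 < COLS:  # left/right pairing, both directions
                right = w[r][c + 1]
                if (t == TL) != (right == TR):
                    return False
                if (t == BL) != (right == BR):
                    return False
            if r + 1 < ROWS:  # a pipe top needs a pipe body below
                below = w[r + 1][c]
                if t == TL and below != BL:
                    return False
                if t == TR and below != BR:
                    return False
            if r > 0:  # a pipe body needs a body or a top above
                above = w[r - 1][c]
                if t == BL and above not in (BL, TL):
                    return False
                if t == BR and above not in (BR, TR):
                    return False
    return True


def semantic_loss(probs):
    """probs[r][c][k] is the probability that tile k is present at (r, c)."""
    cells = [(r, c) for r in range(ROWS) for c in range(COLS)]
    mass = 0.0
    # Enumerating one tile per cell enforces the one-hot clause.
    for assignment in itertools.product(range(5), repeat=len(cells)):
        w = [list(assignment[r * COLS:(r + 1) * COLS]) for r in range(ROWS)]
        if not window_is_valid(w):
            continue
        weight = 1.0
        for (r, c), t in zip(cells, assignment):
            for k, p in enumerate(probs[r][c]):
                weight *= p if k == t else 1.0 - p
        mass += weight
    return math.inf if mass == 0.0 else -math.log(mass)
```

A generator that places all its mass on a complete pipe, or on a window without pipe tiles, incurs zero loss under this sketch, while mass placed on a dangling pipe fragment is penalized.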
Details about the reachability constraint
The feedforward neural network is implemented as a CNN with two final dense layers; its architecture is described in Table 5.
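| Input Shape | Output Shape | Layer Type | Kernel | Stride |
|---|---|---|---|---|
| (, , 13) | (, , 8) | Conv. | | |
| (, , 8) | (, , 16) | Conv. | | |
| (, , 16) | (, , 24) | Conv. | | |
| (, , 24) | (, , 32) | Conv. | | |
| (, , 32) | (, , 64) | Conv. | | |
| (, , 64) | (, , 96) | Conv. | | |
| (, , 96) | (, , 128) | Conv. | | |
| (, , 128) | (, , 192) | Conv. | | |
| (, , 192) | (, , 32) | Dense | | |
| (, , 32) | (, , 2) | Dense | | |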
The approximation network attains high accuracy and F1-score. Figure 4 shows examples of how reachability maps have been approximated by the network. The first column contains levels generated by the GAN. The binary maps have been obtained by summing the probabilities of all the solid tiles (ground, pipes, …), shown in white. The second column contains the reachability maps, computed by the A* agent and averaged over several runs. Finally, the third column shows the reachability maps approximated by the neural network. The approximation works well: the second and third columns are almost indistinguishable.
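The construction of the maps in the first column can be illustrated with a short sketch: for every pixel, the probabilities of the solid tile classes are summed. The function name and the choice of which channel indices count as solid are hypothetical, and the toy level below uses three channels instead of the 13 produced by the generator.

```python
def solidity_map(level_probs, solid_channels):
    """Sum, per pixel, the probabilities of the tile classes deemed solid.

    level_probs[r][c] is a list of per-class probabilities for one pixel;
    solid_channels lists the (assumed) indices of solid tile classes.
    """
    return [
        [sum(pixel[k] for k in solid_channels) for pixel in row]
        for row in level_probs
    ]


# Toy 1x2 level with three classes; classes 0 and 1 are assumed solid.
level = [[[0.7, 0.2, 0.1], [0.0, 0.1, 0.9]]]
solid = solidity_map(level, solid_channels=(0, 1))
```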
Molecule Generation
The MolGAN architecture is composed of three networks: the generator, the discriminator, and the reward network. The discriminator and the reward network share the same architecture, but are trained with different objectives. The reward network is trained to predict the product of the QED, SA, and logP metrics. The generator is optimized to produce samples that maximize the output of the reward network and are convincing to the discriminator; in addition, the conditional semantic loss is applied.
While the reward network is trained in parallel with the generator and the discriminator, the loss of the generator with respect to the output of the reward network is activated only after 150 epochs, at which point the adversarial loss stops being used. The semantic loss is applied to the generator from the start to the end of training. The semantic loss and the adversarial loss of the generator are each scaled by their own constant weight.
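The schedule described above can be summarized by the following sketch. The weights `w_adv` and `w_sem` are placeholders rather than the values used in the experiments, the function name is hypothetical, and we assume that epochs are counted from zero so that the switch happens at epoch 150.

```python
def generator_loss(epoch, adv_loss, reward_loss, sem_loss,
                   w_adv=1.0, w_sem=1.0, switch_epoch=150):
    """Combine the generator's loss terms according to the training phase.

    Before `switch_epoch` the generator follows the adversarial loss;
    afterwards it follows the loss derived from the reward network.
    The semantic loss is added in both phases.
    All default weights are placeholders chosen for this sketch.
    """
    if epoch < switch_epoch:
        main_term = w_adv * adv_loss
    else:
        main_term = reward_loss
    return main_term + w_sem * sem_loss


early = generator_loss(epoch=0, adv_loss=2.0, reward_loss=5.0, sem_loss=1.0)
late = generator_loss(epoch=150, adv_loss=2.0, reward_loss=5.0, sem_loss=1.0)
```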
Improved WGAN is used as the adversarial loss between the generator and the discriminator ( = 5), whereas the reward network is trained to estimate the desired target by mean squared error, and the generator is trained with deep deterministic policy gradient w.r.t. the output of the reward network, which is seen as a reward to maximize.
Training proceeds until the uniqueness of the batch falls below a fixed threshold. During training, the learning rate and the batch size are kept constant and there is no dropout. Adam with β1 = 0.9 and β2 = 0.999 is the optimizer of choice. Batch discrimination is used. Results are obtained by evaluating a batch of generated samples.
The input noise has dimension 32 if the semantic loss is not applied and 36 if it is applied: the first 32 dimensions are sampled from a standard normal distribution and the last four from a uniform distribution.
The maximum number of nodes for each molecule is 9, with 5 possible atom types and 5 bond types. Each molecule is represented by two matrices: one mapping each node to a label, and an adjacency matrix encoding the presence or absence of edges between nodes, together with their type.
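The shapes involved can be made concrete with a small sketch. The helper names are hypothetical, the range of the uniform distribution is assumed to be [0, 1], and the one-hot layout of the two matrices (one row per node over atom types, one vector per node pair over bond types) is an assumption about the encoding rather than a detail stated here.

```python
import random


def sample_noise(use_semantic_loss, rng=random):
    """Draw the generator input: 32 Gaussian entries, plus 4 uniform ones
    when the (conditional) semantic loss is used."""
    z = [rng.gauss(0.0, 1.0) for _ in range(32)]
    if use_semantic_loss:
        # Assumed range [0, 1] for the four extra dimensions.
        z += [rng.uniform(0.0, 1.0) for _ in range(4)]
    return z


def empty_molecule(max_nodes=9, atom_types=5, bond_types=5):
    """Allocate the two matrices describing a molecule.

    nodes[i][a]        : score of atom type a for node i
    adjacency[i][j][b] : score of bond type b between nodes i and j
    """
    nodes = [[0.0] * atom_types for _ in range(max_nodes)]
    adjacency = [
        [[0.0] * bond_types for _ in range(max_nodes)]
        for _ in range(max_nodes)
    ]
    return nodes, adjacency


z_plain = sample_noise(use_semantic_loss=False)
z_cond = sample_noise(use_semantic_loss=True)
nodes, adjacency = empty_molecule()
```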
The input noise is received by the generator and processed by three fully connected layers, each followed by a nonlinear activation function; a linear projection followed by a softmax is then applied to the output of the last layer so that it matches the size of the adjacency matrices.
The discriminator and the reward network share the same architecture (no parameters are shared), based on two relational graph convolutions [De Cao and Kipf, 2018], followed by an aggregation as in [De Cao and Kipf, 2018] to obtain a graph-level representation of 128 features. Two fully connected layers then reduce the graph embedding to a single output value, with a nonlinear activation for the hidden layer and a sigmoid applied to the output in the case of the reward network. The MolGAN architecture is trained on the QM9 dataset [Ramakrishnan et al., 2014, Ruddigkeit et al., 2012].