Constrained Image Generation Using Binarized Neural Networks with Decision Procedures

by Svyatoslav Korneev, et al.

We consider the problem of binary image generation with given properties. This problem arises in a number of practical applications, including generation of artificial porous medium for an electrode of lithium-ion batteries, for composed materials, etc. A generated image represents a porous medium and, as such, it is subject to two sets of constraints: topological constraints on the structure and process constraints on the physical process over this structure. To perform image generation we need to define a mapping from a porous medium to its physical process parameters. For a given geometry of a porous medium, this mapping can be done by solving a partial differential equation (PDE). However, embedding a PDE solver into the search procedure is computationally expensive. We use a binarized neural network to approximate a PDE solver. This allows us to encode the entire problem as a logical formula. Our main contribution is that, for the first time, we show that this problem can be tackled using decision procedures. Our experiments show that our model is able to produce random constrained images that satisfy both topological and process constraints.




1 Introduction

We consider the problem of constrained image generation of a porous medium with given properties. Porous media occur, e.g., in lithium-ion batteries and composed materials [1, 2]; the problem of generating porous media with a given set of properties is relevant in practical applications of material design [3, 4, 5]. Artificial porous media are useful during the manufacturing process as they allow the designer to synthesize new materials with predefined properties. For example, generated images can be used in designing a new porous medium for an electrode of lithium-ion batteries. It is well known that macro-scale ion transport and reaction rates are sensitive to the topological properties of the porous medium of the electrode. Therefore, manufacturing the porous electrode with given properties allows improving the battery performance [1].

Images of porous media are black and white images that represent an abstraction of the physical structure. (Specifically, we look at a translationally periodic “unit cell” of a porous medium, assuming that the porous medium has a periodic structure [5].) Solid parts (so-called grains) are encoded as sets of connected black pixels; a void area is encoded as a set of connected white pixels. There are two important groups of restrictions that images of a porous medium have to satisfy. The first group constitutes a set of “geometric” constraints that come from the problem domain and control the total surface area of grains. For example, we may require that an image contains two isolated solid parts. Figure 1(a) shows examples of 16x16 images from our datasets with two (the top row) and three (the bottom row) grains.

Figure 1: (a) Examples of images from the training sets with two and three grains; (b) examples of images generated by a GAN on the dataset with two grains; (c)–(e) examples of generated images for three different ranges of the dispersion coefficient.

The second set of restrictions comes from the physical process that is defined for the corresponding porous medium. In this paper, we consider the macro-scale transportation process that can be described by a set of dispersion coefficients depending on the transportation direction. For example, we might want to generate images that have two grains such that the dispersion coefficient along a given axis is between 0.5 and 0.6. The dispersion coefficient is defined for the given geometry of a porous medium. It can be obtained as a numerical solution of the diffusion partial differential equation (PDE). We refer to these restrictions on the parameters of the physical process as process constraints.

The state-of-the-art approach to generating synthetic images is to use generative adversarial networks (GANs) [6]. However, GANs are not able to learn geometric, three-dimensional perspective, and counting constraints, which is a known issue with this approach [7, 8]. Our experiments with GAN-generated images also reveal this problem. At the moment, there are no methods that allow embedding of declarative constraints in the image generation procedure.

In this work we show that the image generation problem for porous media can be solved using decision procedures. We show that both geometric and process constraints can be encoded as a logical formula. Geometric constraints are encoded as a set of linear constraints. To encode process constraints, we first approximate the diffusion PDE solver with a neural network (NN) [9, 10]. We use a special class of NNs, called binarized neural networks (Bnn), as these networks can be encoded as logical formulas. Process constraints are encoded as restrictions on outputs of the network. This provides us with an encoding of the image generation problem as a single logical formula. The contributions of this paper can be summarized as follows: (i) We show that constrained image generation can be encoded as a logical formula and tackled using decision procedures. (ii) We experimentally investigate a GAN-based approach to constrained image generation and analyse its advantages and disadvantages compared to the constraint-based approach. (iii) We demonstrate that our constraint-based approach is capable of generating random images that have given properties, i.e., satisfy process constraints.

2 Problem description

We describe a constrained image generation problem. We denote by $I$ an image that encodes a porous medium and by $d$ a vector of parameters of the physical process defined for this porous material. We use an image and a porous medium interchangeably to refer to $I$. We assume that there is a mapping function M that maps an image $I$ to the corresponding parameter vector $d$, $M(I) = d$. We denote by $C_g(I)$ the geometric constraints on the structure of the image $I$ and by $C_d(d)$ the process constraints on the vector of parameters $d$. Given a set of geometric and process constraints and a mapping function M, we need to generate a random image $I$ that satisfies $C_g$ and $C_d$. Next we overview geometric and process constraints and discuss the mapping function.

The geometric constraints define a topological structure of the image. For example, they can ensure that a given number of grains is present on an image and these grains do not overlap. Another type of constraint focuses on a single grain. Such constraints can restrict the shape of a grain (e.g., require a convex grain), its size, or its position on the image. The third type of constraints are boundary constraints that ensure that the boundary of the image is in a void area. Process constraints define restrictions on the vector of parameters. For example, we might want to generate images whose parameters lie in given ranges, $d_j \in [a_j, b_j]$ for each component $d_j$ of the parameter vector.

Next we consider a mapping function M. A standard way to define M is by solving a system of partial differential equations. However, solving these PDEs is a computationally demanding task and, more importantly, it is not clear how to ‘reverse’ them to generate images with given properties. Hence, we take an alternative approach of approximating a PDE solver using a neural network [9, 10]. To train such an approximation, we build a training set of pairs $(I_i, d_i)$, $i = 1, \ldots, n$, where the image $I_i$ is an input of the network and the parameter vector $d_i$, obtained by solving the PDE given $I_i$, is its label. In this work, we use a special class of deep neural networks, binarized neural networks (Bnn), that admit an exact encoding into a logical formula. We assume that M is represented as a Bnn and is given as part of the input. We will elaborate on the training procedure in Section 5.

3 The generative neural network approach

One approach to tackle the constrained image generation problem is to use generative adversarial networks (GANs) [6, 11]. GANs are successfully used to produce samples of realistic images for commonly used datasets, e.g. interior design, clothes, animals, etc. A GAN can be described as a game between the image generator that produces synthetic (fake) images and a discriminator that distinguishes between fake and real images. The cost function is defined in such a way that the generator and the discriminator aim to maximize and minimize this cost function, respectively, turning the learning process into a minimax game between these two players. Each player is usually represented as a neural network. To apply GANs to our problem, we take a set of images and pass them to the GAN. These images are samples of real images for the GAN. After the training procedure is completed, the generator network produces artificial images that look like real images. The main advantage of GANs is that it is a generic approach that can be applied to any type of images and can handle complex concepts, like animals, scenes, etc. (GANs exhibit well-known issues with poor convergence, which we did not observe as our dataset is quite simple [12].) However, the main issue with this approach is that there is no way to explicitly pass declarative constraints into the training procedure. One might expect that GANs are able to learn these constraints from the set of examples. However, this is not the case at the moment, e.g., GANs cannot capture counting constraints, like four legs, two eyes, etc. [7]. Figure 1(b) shows examples of images that a GAN produces on a dataset with two grains per image. As can be seen from these examples, the GAN produces images with an arbitrary number of grains, between 1 and 5 per image. In some simple cases, it is easy to filter out wrong images. If we have more sophisticated constraints like convexity or size of grains, then most images will be invalid. On top of this, to take into account process constraints, we need additional restrictions on the training procedure. Overall, how to extend the GAN training procedure with physical constraints is an interesting research question, which is beyond the scope of this paper [13]. Next we consider our approach to the image generation problem.
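The simple filtering case mentioned above (discarding GAN outputs with the wrong number of grains) can be sketched as follows. This is a minimal illustration, not part of the original experiments: it assumes an image is a list of rows of 0/1 values with 1 for black (solid) pixels, and that a grain is a 4-connected component of black pixels. The names `count_grains` and `keep_valid` are hypothetical.

```python
from collections import deque


def count_grains(image):
    """Count 4-connected components of black (1) pixels via breadth-first search."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    grains = 0
    for i in range(rows):
        for j in range(cols):
            if image[i][j] != 1 or seen[i][j]:
                continue
            # A new, unvisited black pixel starts a new grain.
            grains += 1
            seen[i][j] = True
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    x, y = a + da, b + db
                    if (0 <= x < rows and 0 <= y < cols
                            and image[x][y] == 1 and not seen[x][y]):
                        seen[x][y] = True
                        queue.append((x, y))
    return grains


def keep_valid(images, expected_grains):
    """Keep only the images with exactly the expected number of grains."""
    return [img for img in images if count_grains(img) == expected_grains]
```

Such a post-hoc filter only handles the counting constraint; shape constraints such as convexity, and the process constraints, would each need their own check, which is the limitation the text points out.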

4 The constraint-based approach

The main idea behind our approach is to encode the image generation problem as a logical formula. To do so, we need to encode all problem constraints and the mapping between an image and its label as a set of constraints. We start with constraints that encode an approximate PDE solver. We denote by $[N]$ the range of numbers from 1 to $N$.

4.1 Approximation of a PDE solver.

One way to approximate a diffusion PDE solver is to use a neural network [9, 10]. A neural network is trained on a set of binary images $I_i$ and their labels $d_i$, $i = 1, \ldots, n$. During the training procedure, the network takes an image $I_i$ as an input and outputs its estimate $\hat{d}_i$ of the parameter vector $d_i$. As we have ground truth parameters for each image, we can use the mean square error or absolute value error as a cost function to perform optimization [14]. In this work, we take the same approach. However, we use a special type of network: binarized neural networks (Bnn). A Bnn is a feedforward network where weights and activations are binary [15]. It was shown in [14, 16] that Bnns allow exact encoding as logical formulas, namely, they can be encoded as a set of reified linear constraints over binary variables. We use Bnns as they have a relatively simple structure and decision procedures scale to reason about small and medium size networks of this type. In theory, we can use any exact encoding to represent a more general network, e.g., MILP encodings that are used to check robustness properties of neural networks [17, 18]. However, the scalability of decision procedures is the main limitation in the use of more general networks. We use the ILP encoding as in [14] with a minor modification of the last layer as we have numeric outputs instead of categorical outputs. We denote by $\mathrm{EncBnn}(I, d)$ a logical formula that encodes the Bnn using reified linear constraints over Boolean variables (Section 4, ILP encoding [14]).
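To make the idea of a reified linear constraint concrete, the sketch below looks at a single binarized neuron. It is a simplified illustration and not the encoding of [14]: it assumes weights and inputs in {-1, +1}, a sign activation that outputs +1 when the pre-activation is non-negative, and no batch normalization. Under these assumptions the neuron fires iff a linear inequality over the 0/1 encoding of the inputs holds, and the helper `check_equivalence` (a hypothetical name) confirms this by enumerating all inputs.

```python
import itertools
import math


def neuron_output(weights, bias, x_pm):
    """Binarized neuron on +/-1 inputs: +1 if w.x + bias >= 0, else -1."""
    s = sum(w * x for w, x in zip(weights, x_pm)) + bias
    return 1 if s >= 0 else -1


def reified_threshold(weights, bias):
    """Threshold T such that the neuron outputs +1 iff sum(w_i * b_i) >= T,
    where b_i in {0, 1} encodes the input x_i = 2 * b_i - 1."""
    # w.(2b - 1) + bias >= 0  <=>  w.b >= (sum(w) - bias) / 2; w.b is an integer.
    return math.ceil((sum(weights) - bias) / 2)


def check_equivalence(weights, bias):
    """Brute-force check that the reified linear form matches the neuron."""
    threshold = reified_threshold(weights, bias)
    for bits in itertools.product((0, 1), repeat=len(weights)):
        x_pm = [2 * b - 1 for b in bits]
        fires = neuron_output(weights, bias, x_pm) == 1
        linear = sum(w * b for w, b in zip(weights, bits)) >= threshold
        if fires != linear:
            return False
    return True
```

In a solver, the Boolean output of the neuron would be tied to the inequality by a reified (indicator) constraint; stacking such constraints layer by layer gives a formula over Boolean variables only.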

4.2 Geometric and process constraints.

Geometric constraints can be roughly divided into three types. The first type of constraints defines the high-level structure of the image. The high-level structure of our images is defined by the number of grains present in the image. Let $w$ be the number of grains per image. We define a grid of size $t \times t$. Figure 2(a) shows an example of such a grid. We refer to a cell $(i, j)$ on the grid as a pixel as this grid encodes an image of size $t \times t$. Next we define the neighbour relation on the grid. We say that a cell $(i', j')$ is a neighbour of $(i, j)$ if these cells share a side. For example, $(i, j+1)$ is a neighbour of $(i, j)$ as the right side of $(i, j)$ is shared with $(i, j+1)$. Let $Nb(i, j)$ be the set of neighbours of $(i, j)$ on the grid. For example, for a cell that is not on the border of the grid, $Nb(i, j) = \{(i-1, j), (i+1, j), (i, j-1), (i, j+1)\}$.

Figure 2: Illustrative examples of additional structures used by constraint-based model.


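For each cell $(i, j)$ of the grid and each $r \in \{1, \ldots, w+1\}$, where $w$ is the number of grains, we introduce a Boolean variable $c_{i,j,r}$. $c_{i,j,r} = 1$ iff the cell $(i, j)$ belongs to the $r$th grain, $r \le w$. Similarly, $c_{i,j,w+1} = 1$ iff the cell $(i, j)$ represents a void area.

Each cell is either a black or white pixel.

We enforce that each cell contains either a grain or a void area.

$$\sum_{r=1}^{w+1} c_{i,j,r} = 1 \quad \text{for every cell } (i, j). \tag{1}$$

Grains do not overlap.

Two cells that belong to different grains cannot be neighbours.

$$c_{i,j,r} \rightarrow \neg c_{i',j',r'} \quad \text{for every cell } (i, j), \text{ every neighbour } (i', j') \text{ of } (i, j), \text{ and all grains } r \neq r',\ r, r' \le w. \tag{2}$$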
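The two constraints above can be read as simple checks on a candidate assignment. The following sketch is only an illustration under stated assumptions: it uses 0-based indices (grains are numbered 0..w-1 and index w stands for the void area), a nested-list layout `c[i][j][r]` for the Boolean variables, and hypothetical helper names. A solver would search for an assignment satisfying these conditions rather than test a given one.

```python
def neighbours(i, j, t):
    """Cells of a t x t grid that share a side with cell (i, j)."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if 0 <= a < t and 0 <= b < t:
            yield a, b


def one_hot(labels, w):
    """Turn a grid of labels (0..w-1 = grain, w = void) into 0/1 variables c[i][j][r]."""
    return [[[1 if lab == r else 0 for r in range(w + 1)] for lab in row]
            for row in labels]


def exactly_one_label(c):
    """Each cell is assigned to exactly one grain or to the void area."""
    return all(sum(cell) == 1 for row in c for cell in row)


def grains_do_not_touch(c, w):
    """No two neighbouring cells belong to different grains."""
    t = len(c)
    for i in range(t):
        for j in range(t):
            for a, b in neighbours(i, j, t):
                for r in range(w):
                    for q in range(w):
                        if r != q and c[i][j][r] and c[a][b][q]:
                            return False
    return True
```

For example, a grid where grain 0 and grain 1 are separated by a column of void cells passes both checks, while a grid where the two grains share a side fails the second one.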


Grains are connected areas.

We enforce connectivity constraints for each grain. By connectivity we mean that there is a path between any two cells of the same grain using only cells that belong to this grain. Unfortunately, enforcing connectivity constraints is very expensive: encoding the path constraint results in a prohibitively large encoding. To deal with this explosion, we restrict the space of possible grain shapes. First, we assume that we know the position of one pixel of each grain, which we pick randomly. Let $s_r$ be a random cell chosen for the $r$th grain. Then we implicitly build a directed acyclic graph (DAG) $G_r$ starting from this cell that covers the entire grid. Each cell of the grid is a node in this graph. The node that corresponds to the cell $s_r$ does not have incoming arcs. There are multiple ways to build $G_r$ from $s_r$. Figure 2(a) and (d) show two possible ways to build a DAG that covers a grid starting from a given cell. Next we define a parent relation in $G_r$. Let $Pr(i, j)$ be the set of parents of cell $(i, j)$ in $G_r$; Figure 2(a) illustrates this relation. Given a DAG $G_r$, we can easily enforce the connectivity relation w.r.t. $G_r$. The following constraint ensures that a cell other than $s_r$ belongs to the $r$th grain only if one of its parents in $G_r$ belongs to the same grain.

$$c_{i,j,r} \rightarrow \bigvee_{(i',j') \in Pr(i,j)} c_{i',j',r} \quad \text{for every cell } (i, j) \neq s_r, \tag{3}$$

where $c_{i,j,r}$ is the Boolean variable stating that cell $(i, j)$ belongs to the $r$th grain. Moreover, by enforcing connectivity constraints on the void area, we make sure that grains do not contain isolated void areas inside them.


Given a DAG $G_r$, we can generate grains of multiple shapes. For example, Figure 2(b) shows one possible grain. However, we also lose some valid shapes that are ruled out by the choice of the graph $G_r$. For example, Figure 2(c) gives an example of a shape that is not possible to build using the DAG in Figure 2(a). However, if we select a different random DAG, e.g., the one in Figure 2(d), then this shape becomes one of the possible shapes. In general, since we can pick the starting cell and the DAG randomly, it is possible to generate a variety of shapes.
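The trade-off described above can be seen on a small example. The sketch below is an illustration under an assumption of its own: it uses one particular DAG in which the parents of a cell are its neighbours that are one step closer to the starting cell in Manhattan distance. This is not necessarily the construction shown in Figure 2, and the names `parents` and `satisfies_parent_rule` are hypothetical. The rule checked is that every grain cell other than the starting cell has at least one parent in the grain.

```python
def parents(cell, seed, t):
    """Neighbours of `cell` on a t x t grid that are one step closer to `seed`
    in Manhattan distance (one possible DAG rooted at `seed`)."""
    i, j = cell
    dist = abs(i - seed[0]) + abs(j - seed[1])
    result = []
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if (0 <= a < t and 0 <= b < t
                and abs(a - seed[0]) + abs(b - seed[1]) == dist - 1):
            result.append((a, b))
    return result


def satisfies_parent_rule(grain, seed, t):
    """True iff the seed is in the grain and every other grain cell
    has at least one parent that is also in the grain."""
    if seed not in grain:
        return False
    return all(any(p in grain for p in parents(cell, seed, t))
               for cell in grain if cell != seed)
```

Any grain that passes this check is connected, since following parents from a grain cell always leads back to the starting cell inside the grain. The converse does not hold: a connected U-shaped grain that moves away from the starting cell and then comes back towards it is rejected by this DAG, which mirrors the loss of some valid shapes noted in the text.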

Compactness of a grain.

The second set of constraints is about restrictions on a single grain. The compactness constraint is a form of convexity constraint. We want to ensure that any two boundary points of a grain are close to each other. The reason for this constraint is that grains are unlikely to have a long snake-like appearance as solid particles tend to group together. Sometimes, we need to enforce the convexity constraint, which is an extreme case of compactness. To enforce this constraint, we again trade off the variety of shapes against the size of the encoding. Now we assume that the chosen cell $s_r$ is the center of the $r$th grain. Then we build virtual circles around this center that cover the grid. Figure 2(e) shows examples of such circles. Let $\gamma_1, \ldots, \gamma_q$ be the circles that are built with the cell $s_r$ as a center, ordered from the innermost to the outermost. The following constraint enforces that a cell that belongs to the circle $\gamma_v$ can be in the $r$th grain iff all cells from the inner circle $\gamma_{v-k}$ belong to the $r$th grain, where $k$ is a parameter.

$$c_{i,j,r} \rightarrow c_{i',j',r} \quad \text{for all } (i, j) \in \gamma_v,\ (i', j') \in \gamma_{v-k},\ v > k, \tag{4}$$

where $c_{i,j,r}$ is the Boolean variable stating that cell $(i, j)$ belongs to the $r$th grain. Note that if $k = 1$ then we generate convex grains. In this case, every pixel from $\gamma_{v-1}$ has to belong to the $r$th grain before we can add a pixel from the circle $\gamma_v$ to this grain.
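The effect of the compactness parameter can be illustrated with a small check. The sketch below makes assumptions that are not taken from the paper: the "virtual circles" are approximated by square rings of cells at equal Chebyshev distance from the center, rings are indexed from 0 (the center itself), and the parameter is called `k`. A cell in ring v is allowed in the grain only if every cell of ring v - k is in the grain.

```python
def ring_index(cell, centre):
    """Chebyshev distance to the centre; cells with the same value form one ring."""
    return max(abs(cell[0] - centre[0]), abs(cell[1] - centre[1]))


def rings(centre, t):
    """Group all cells of a t x t grid by ring index."""
    out = {}
    for i in range(t):
        for j in range(t):
            out.setdefault(ring_index((i, j), centre), set()).add((i, j))
    return out


def is_compact(grain, centre, t, k):
    """True iff every grain cell in ring v has the whole ring v - k inside the grain."""
    by_ring = rings(centre, t)
    for cell in grain:
        inner = ring_index(cell, centre) - k
        # Rings closer than k to the centre have no inner ring to check.
        if inner >= 0 and not by_ring[inner] <= grain:
            return False
    return True
```

With `k = 1` a thin horizontal strip growing from the center is rejected, because its cell in ring 2 requires the whole of ring 1; with `k = 2` the same strip is accepted, which shows how a larger parameter admits less compact shapes.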

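Boundary constraints.

We also have a technical constraint that all cells on the boundary of the grid must be void pixels. This is required to define boundary conditions for PDEs on generated images.

$$c_{i,j,w+1} = 1 \quad \text{for every cell } (i, j) \text{ on the boundary of the grid}, \tag{5}$$

where $c_{i,j,w+1}$ is the Boolean variable stating that cell $(i, j)$ is void.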


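Connecting with Bnn.

We need to connect the cell variables with the inputs of the network. This is constraint (6): it ties each input pixel of the Bnn to the variable that states whether the corresponding cell is void.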


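Process constraints.

Process constraints are enforced on the output of the network. Given ranges $[a_j, b_j]$ for the components $d_j$ of the output vector $d$, we have:

$$d_j \in [a_j, b_j] \quad \text{for every component } d_j \text{ of } d. \tag{7}$$

To solve the constrained random image generation problem, we solve the conjunction of constraints (1)–(7) together with our ILP encoding of the Bnn. Randomness comes from the random seed that is passed to the solver and from a random choice of the starting cell and the DAG for each grain.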

5 Experiments

We conduct a set of experiments with our constraint-based approach. We ran our experiments on an Intel(R) Xeon(R) 3.30GHz machine. We use a timeout of 600 sec in all runs.

Training procedure.

We use two datasets, $D_2$ with 10K images and $D_3$ with 5K images. Each image in $D_2$ contains two grains and each image in $D_3$ contains three grains. These images were labeled with the dispersion coefficient $d$ along one axis, which is a number between 0.4 and 1. We performed quantization on the dispersion coefficient value to map $d$ into an interval of integers between 40 and 100. We use mean absolute error (MAE) to train the Bnn. The Bnn consists of three blocks with 100 neurons per layer and one output. The MAE is 4.2 for $D_2$ and 5.1 for $D_3$. We lose accuracy compared to non-binarized networks, e.g., the MAE of the same network without binarization is 2.5. However, Bnns are much easier to reason about, so we work with this subclass of networks.
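The labeling and error measure described above can be sketched as follows. This is an illustration only: the paper does not state the exact quantization rule, so the sketch assumes scaling by 100 and rounding to the nearest integer, which maps coefficients in [0.4, 1] to integers in [40, 100] as in Table 1. The function names are hypothetical.

```python
def quantize(coefficient, scale=100):
    """Map a dispersion coefficient in [0.4, 1] to an integer in [40, 100].
    The scale and the rounding rule are assumptions of this sketch."""
    if not 0.4 <= coefficient <= 1.0:
        raise ValueError("coefficient outside the expected range [0.4, 1]")
    return int(round(coefficient * scale))


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true quantized labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

On this scale an MAE of 4.2 corresponds to an average deviation of about 0.042 in the original dispersion coefficient.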

Image generation.

We use CPLEX and the SMT solver Z3 to solve instances produced by constraints (1)–(7) together with the ILP encoding of the Bnn. In principle, other solvers could be evaluated on these instances. The best mode for Z3 was to use an SMT core based on CDCL and a theory solver for nested pseudo-Boolean and cardinality constraints. We noted that bit-blasting into sorting circuits did not scale, and Z3’s theory of linear integer arithmetic was also inadequate. We considered six process constraints for the dispersion coefficient $d$, namely, that $d$ lies in one of the six intervals listed in Table 1. For each interval, we generate 100 random constrained problems. The randomization comes from a random seed that is passed to the solver, the position of the center of each grain, and the compactness parameter in constraint (4). We used the same DAG construction as in Figure 2(a) in all problems. Table 1 shows a summary of our results for the CPLEX and Z3 solvers. As can be seen from this table, these instances are relatively easy for the CPLEX solver. It can solve most of them within the given timeout. The average time with CPLEX is 25s for the two-grain dataset $D_2$ and 12s for the three-grain dataset $D_3$. Z3 handles most benchmarks, but we observed that it gets stuck on examples that are very easy for CPLEX (see Table 1). We hypothesize that this is due to how watch literals are tracked in a very general way on nested cardinality constraints (Z3 maintains a predicate for each nested PB constraint and refreshes the watch list whenever the predicate changes assignment), when one could instead exploit the limited way that CPLEX allows conditional constraints. The average time with Z3 is 94s for $D_2$ and 64s for $D_3$.

Figure 3: The absolute error between the dispersion coefficient computed by the model and its true value.

Figures 1(c)–(e) show examples of generated images for three ranges of the dispersion coefficient $d$, for the two-grain dataset (the top row) and the three-grain dataset (the bottom row). For the process we consider, as the value of the dispersion coefficient grows, the black area should decrease as there should be fewer grain obstacles for a flow to go through the porous medium. Indeed, images in Figures 1(c)–(e) follow this pattern, i.e. the black area on images with small values of $d$ is significantly larger than on images with large values of $d$. Moreover, by construction, they satisfy geometric constraints that GANs cannot handle. For each image we generated, we run a PDE solver to compute the true value of the dispersion coefficient on this image. Then we compute the absolute error between the value of $d$ that our model computes and the true value of the coefficient. Figure 3 shows absolute errors for all benchmarks that were solved by CPLEX. First, this figure shows that our model generates images with given properties. The mean absolute error (MAE) is about 10 on these instances. Taking into account that the Bnn has an MAE of 4.2 on the two-grain training data, an MAE of 10 on newly generated instances is a reasonable result. Ideally, we would like the MAE to be zero. However, this error depends purely on the Bnn we used. To reduce this error, we need to improve the accuracy of the Bnn as it serves as an approximator of a PDE solver. For example, we can use more binarized layers or use additional non-binarized layers. Of course, increasing the power of the network leads to computational challenges in solving the corresponding logical formulas.

         Two-grain dataset                                    Three-grain dataset
Solver   [40,50) [50,60) [60,70) [70,80) [80,90) [90,100]    [40,50) [50,60) [60,70) [70,80) [80,90) [90,100]
CPLEX    100     99      99      98      100     41          100     100     96      99      100     84
Z3       98      89      81      74      56      12          100     97      97      97      96      54
Table 1: The number of solved instances (out of 100) in each interval of the dispersion coefficient.

6 Related work

There are two lines of work related to our paper. The first one uses constraints to enhance machine learning techniques with declarative constraints, e.g. in solving constrained clustering problems and in data mining techniques that handle domain-specific constraints [19, 20, 21]. One recent example is the work of Ganji et al. [20], who proposed a logical model for constrained community detection. The second line of research explores embedding of domain-specific constraints in the GAN training procedure [13, 22, 23, 8, 24]. Work in this area targets various applications in physics and medicine that impose constraints, like sparsity constraints, high dynamic range requirements (e.g. when pixel intensity in an image varies by orders of magnitude), location specificity constraints (e.g. shifting pixel locations can change important image properties), etc. However, this research area is emerging and the results are still preliminary.

7 Conclusion

In this paper we considered the constrained image generation problem for a physical process. We showed that this problem can be encoded as a logical formula over Boolean variables. For small porous media, we showed that the generation process is computationally feasible for modern decision procedures. There are many interesting directions for future research. First, the main limitation of our approach is scalability, as we cannot use large networks with a number of weights in the order of hundreds of thousands, as is required by industrial applications. However, constraints that are used to encode, for example, binarized neural networks are mostly pseudo-Boolean constraints with unary coefficients. Hence, it would be interesting to design specialized procedures to deal with this fragment of constraints. Second, we need to investigate different types of neural networks that admit encoding into SMT or ILP. For instance, there is a lot of work on quantized networks that use a small number of bits to encode each weight, e.g. [25]. Finally, can we use similar techniques to reveal vulnerabilities in neural networks? For example, we might be able to generate constrained adversarial examples or other special types of images that expose undesired network behaviour.