Interior Point Methods with Adversarial Networks

05/23/2018 ∙ by Rafid Mahmood, et al. ∙ York University

We present a new methodology, called IPMAN, that combines interior point methods and generative adversarial networks to solve constrained optimization problems with feasible sets that are non-convex or not explicitly defined. Our methodology produces ϵ-optimal solutions and, when there are multiple global optima, learns a distribution over the optimal set. We apply our approach to synthetic examples to demonstrate its effectiveness and to a problem in radiation therapy treatment optimization with a non-convex feasible set.

1 Introduction

A constrained optimization problem involves the minimization of an objective function subject to constraints that limit the set of possible feasible solutions. Training a neural network to solve constrained optimization problems in a generative way (i.e., without a training set of problem parameters and optimal solutions) is as old as the field of deep learning itself [Hopfield and Tank, 1985]. Typically, the objective is modified by adding a barrier function that penalizes the objective value when solutions do not satisfy the constraints; the modified objective is then approximated with a neural network. This approach parallels interior point methods (IPMs), which are a family of classical operations research techniques for constrained optimization. In IPMs, a differentiable barrier function is first constructed before using gradient descent on the modified objective [Nemirovski and Nesterov, 1994]. IPMs have become indispensable for large-scale linear and quadratic convex optimization problems, due not only to complexity guarantees, but also because the number of iterations required scales gracefully with the problem size [Gondzio, 2012]. However, a drawback of IPMs is that the problem needs to be well-defined. Specifically, to obtain these complexity guarantees, the objective function and all the constraints must be known a priori and the feasible set must be well-behaved, e.g., convex [Bubeck and Eldan, 2014]. Furthermore, these methods typically search for a single optimal solution, while in practice, multiple optimal solutions may exist and need to be determined.

In this paper, we study the task of generatively learning the optimal solution set of a constrained optimization problem, including situations where the feasible set cannot be explicitly defined. Our method combines ideas from IPMs and generative adversarial learning [Goodfellow et al., 2014]. We consider a general class of problems where constraints may be non-convex or even not explicitly defined. Instead, the feasible region is learned via a dataset of given feasible solutions, modeling the situation where we observe actions of decision makers who may be adhering to private, non-observable constraints. Analogous to IPMs, which use a barrier function to enforce feasibility and find an iteratively improving solution, our generative adversarial network (GAN) trains a discriminator to act as a barrier function and a generator to return improving solutions. We prove that our approach recovers analogous theoretical guarantees of IPMs. Moreover, our methodology allows the generator network to simultaneously learn the entire set of optimal solutions rather than converging to a single point. This is particularly important if decision makers wish to explore the Pareto frontier of optimal solutions to consider various trade-offs before implementing a particular optimal solution.

Our work is motivated by a planning problem in radiation therapy (RT) treatment optimization. RT is used to treat more than 50% of cancer patients worldwide [Delaney et al., 2005]. The current treatment optimization paradigm consists of a human-driven, iterative process of solving an approximation to the true non-convex, constrained optimization problem. That is, while a treatment planner generates deliverable treatments, an oncologist provides feedback on clinical acceptability until final approval is given. This iterative process traverses the Pareto frontier of deliverable plans while the planner simultaneously learns the oncologist's private preferences regarding the feasible set. As this process often requires several iterations over a span of days, there is significant interest in automating the pipeline using procedurally generated plans [McIntosh and Purdie, 2017]. The most common approach is to solve a convex approximation to the problem, and then measure the quality of the resulting solution against the true non-convex measures [Babier et al., 2018]. We propose a methodology that can directly produce deliverable plans. In our numerical results, we show that it is possible to generate high-quality solutions with these non-convex measures incorporated into a simplified treatment optimization model. To our knowledge, this is the first approach to RT treatment planning that directly tackles the core non-convexity of the problem using a deep learning approach.

Our specific contributions are as follows:

  1. We develop a novel approach to solving non-convex optimization problems, using a data-only specification of the feasible set. Our approach can generate the entire optimal solution set, a critical task when decision makers are interested in evaluating multiple optimal solutions.

  2. We provide the first integration of methodology from the domains of IPMs and GANs. Our proposed algorithm, IPMAN, replaces the traditional barrier with a discriminator and the optimal solution with the generator distribution. Our approach also provides a theoretical guarantee that by solving our modified problem with the discriminator as the barrier, any resulting solution is $\epsilon$-optimal with respect to the original, possibly non-convex problem.

  3. We apply the IPMAN algorithm to a problem in radiation therapy treatment planning. We demonstrate that the generated plans satisfy the non-convex problem constraints and, across various objectives, score better than a training set of deliverable plans.

2 Related Work

2.1 Interior Point Methods

Interior point methods (IPMs), or barrier methods, are standard techniques for solving constrained optimization problems [Nemirovski, 2004], defined by an objective function $f(x)$ and a feasible set $\mathcal{X}$ in which the optimal decision vector $x^*$ must reside. Consider the problem

$$\min_{x} \; \{ f(x) \;:\; x \in \mathcal{X} \}. \qquad (1)$$

Formulation (1) can be transformed into an unconstrained optimization problem by “dualization”, i.e., introducing a regularization parameter $\lambda$ and a penalty function to address feasibility. For IPMs, this is achieved with a barrier function $B(x)$ that satisfies two properties: (i) $B(x)$ is unbounded when $x$ is infeasible; and (ii) $B(x)$ is finite when $x$ is feasible. The resulting unconstrained optimization problem minimizes the objective function combined with the barrier term weighted by $\lambda$.

Given a differentiable barrier function and an initial solution $x_0$, IPMs use the Newton method to iterate over the solution $x$ and the regularization parameter $\lambda$ until convergence to an optimal solution [Boyd and Vandenberghe, 2004]. These methods have found most success in linear and quadratic optimization, where it is possible to give theoretical guarantees on optimality, as well as fast empirical convergence rates [Gondzio, 2012]. For arbitrary convex feasible sets, Nemirovski and Nesterov [1994] introduced the concept of universal self-concordant barriers, a family of functions with properties that guarantee convergence. The first explicit construction of such functions was only recently proposed [Bubeck and Eldan, 2014] and this has led to renewed interest in IPMs for more challenging convex optimization problems (e.g., Abernethy and Hazan, 2015, Karimi et al., 2017). We note that although the extant literature has primarily focused on problems with convex feasible sets, IPMs have also been adapted for problems with non-convex feasible sets [Vanderbei and Shanno, 1999, Benson et al., 2004, Hinder and Ye, 2018]. In this case, IPMs are more difficult to implement. Efficiency guarantees may not exist, and if they do, the functions must obey strict requirements (e.g., differentiability).

These approaches require that the constraints and objective are known to the optimizer a priori and characterized by differentiable functions. In this paper, we extend the literature on IPMs by proposing a methodology that requires neither an explicit description of the set of constraints nor that the constraints obey any particular properties. Instead, we define the feasible region using only a set of data points representing observed decisions by a user or set of users.
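The classical barrier iteration described in this subsection can be sketched on a toy problem. The example below is a minimal illustration, not the authors' code: it assumes the one-dimensional problem of minimizing f(x) = x over the interval [1, 3], a standard logarithmic barrier, and a geometric schedule for the regularization parameter. The function name `barrier_minimize` and all default values are hypothetical.

```python
import math

def barrier_minimize(x0=2.0, lam=1.0, shrink=0.1, tol=1e-6):
    """Minimize f(x) = x subject to 1 <= x <= 3 with a logarithmic barrier."""
    def phi(x, lam):
        # Objective plus weighted barrier; finite only strictly inside (1, 3).
        return x - lam * (math.log(x - 1.0) + math.log(3.0 - x))

    x = x0
    while 2 * lam > tol:  # two constraints, so the optimality gap is at most 2 * lam
        for _ in range(100):  # Newton iterations for the current lam
            grad = 1.0 - lam * (1.0 / (x - 1.0) - 1.0 / (3.0 - x))
            hess = lam * (1.0 / (x - 1.0) ** 2 + 1.0 / (3.0 - x) ** 2)
            step = -grad / hess
            if abs(step) < 1e-12:
                break
            s = 1.0
            # Backtrack until the iterate is strictly feasible and phi decreases.
            while s > 1e-12 and not (
                1.0 < x + s * step < 3.0
                and phi(x + s * step, lam) <= phi(x, lam) + 0.25 * s * grad * step
            ):
                s *= 0.5
            x += s * step
        lam *= shrink  # reduce the weight on the barrier and re-center
    return x
```

Calling `barrier_minimize()` returns a strictly feasible point slightly above the constrained minimizer x = 1. The sketch relies on knowing the constraints explicitly, which is exactly the requirement the data-driven approach in this paper aims to remove.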

2.2 Generative Adversarial Networks

GANs and their variants have revolutionized generative modeling for a variety of applications [Goodfellow et al., 2014, Taigman et al., 2016, Wu et al., 2016, Isola et al., 2017]. Let $x \sim p_{\text{data}}$ denote samples from a data distribution and $z \sim p_z$ denote a latent distribution. In a GAN, a generator $G$ and a discriminator $D$ compete in a min-max game, where the generator learns to generate samples $G(z)$ and the discriminator learns to classify them as belonging to $p_{\text{data}}$ or not (see Goodfellow [2016] for more details). The loss function for this game is given below:

$$\min_{G} \max_{D} \; \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]. \qquad (2)$$

A GAN that is trained to global optimality possesses the following properties.

Lemma 2.2 [Goodfellow et al., 2014]. Consider a GAN of sufficient capacity trained on a data distribution $p_{\text{data}}$. Let $p_G$ denote the generator output distribution.

  1. For any fixed generator $G$, the optimal discriminator is $D^*_G(x) = \dfrac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_G(x)}$.

  2. For an optimal generator $G^*$, the optimal discriminator is $D^*(x) = \tfrac{1}{2}$.

While Lemma 2.2 provides powerful theoretical guarantees for performance, several recent results demonstrate that global convergence of both generator and discriminator is usually not attainable [Li et al., 2017, Heusel et al., 2017, Nagarajan and Kolter, 2017, Mescheder et al., 2017, Arjovsky and Bottou, 2017]. In particular, Li et al. [2017] prove that there are trade-offs between generator and discriminator updates. Moreover, Arjovsky and Bottou [2017] show that while an optimal discriminator can be achieved, an optimal generator usually cannot. Fortunately, this inherent weakness of GANs is less concerning in our setting. While an optimal generator/discriminator pair is ideal, our main result below only requires that the discriminator be optimal for any arbitrary generator.
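The two properties in Lemma 2.2 can be checked numerically on a small discrete example. The sketch below assumes probability mass functions given as lists over a shared finite set of points, with every point having positive mass under at least one of the two distributions. The function name is hypothetical.

```python
def optimal_discriminator(p_data, p_gen):
    """Pointwise optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_gen(x)).

    Assumes p_data[i] + p_gen[i] > 0 for every index i.
    """
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_gen)]

# A generator that places half of its mass outside the support of the data.
print(optimal_discriminator([0.5, 0.5, 0.0], [0.25, 0.25, 0.5]))

# A generator that matches the data distribution exactly.
print(optimal_discriminator([0.2, 0.8], [0.2, 0.8]))
```

In the first call the discriminator output is zero exactly where the data distribution has no mass, which is the property that lets it act as a barrier. In the second call every output equals one half.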

3 An Interior Point Algorithm using Adversarial Networks

We propose a two-stage method for solving an arbitrary constrained optimization problem by combining IPMs with deep learning. We train a GAN and use the discriminator as a barrier and the generator to provide an initial set. In Section 3.1, we prove the effectiveness of the discriminator as a barrier. In Section 3.2, we re-purpose the generator to converge to the optimal set for a desired objective function. The pseudo-code for the complete IPMAN algorithm is given in Algorithm 1.

3.1 The discriminator as the barrier

Consider the constrained optimization problem (1) with an objective function $f(x)$ that we assume, for simplicity, is bounded below. We do not explicitly define the feasible set $\mathcal{X}$ and make no assumptions on its structure. Let $x^*$ denote an optimal solution and $\mathcal{X}^*$ denote the set of optimal solutions. Further, let $p_{\mathcal{X}}$ denote an arbitrary “feasible distribution”, i.e., any distribution whose support is $\mathcal{X}$. In the first stage, we train a GAN using a dataset of samples drawn from $p_{\mathcal{X}}$.

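Theorem 3.1. Consider a GAN trained on the feasible distribution $p_{\mathcal{X}}$. For any fixed generator $G$, let $D^*_G$ be the optimal discriminator. Then, for any $\epsilon > 0$, there exist constants such that

(3)

is bounded and feasible. Moreover, an optimal solution $\hat{x}$ to (3) is $\epsilon$-optimal for (1), i.e., with $f$ the objective function and $x^*$ an optimal solution of (1), it satisfies

$$f(x^*) \le f(\hat{x}) \le f(x^*) + \epsilon.$$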

Proof.

We first prove that (3) is bounded and feasible. Note that, by assumption, $f(x)$ is bounded below. Further, by definition, there exists an optimal solution $x^*$ to (1). From Lemma 2.2, the optimal discriminator satisfies $D^*_G(x) > 0$ for all $x \in \mathcal{X}$, confirming that (3) is feasible and bounded.

We next prove $f(x^*) \le f(\hat{x})$. If $\hat{x}$ is an optimal solution to (3), its existence implies that the barrier term is bounded, i.e., $D^*_G(\hat{x}) > 0$. Thus, $\hat{x}$ must be feasible for (1) which confirms the inequality.

Finally, we show that $f(\hat{x}) \le f(x^*) + \epsilon$. Since $x^*$ is an optimal solution to (1), $x^* \in \mathcal{X}$ and $D^*_G(x^*) > 0$. Thus, it is a feasible solution to (3). Since $\hat{x}$ is the optimal solution to (3), its objective value in (3) is no larger than that of $x^*$.

Define and . Then, and we have , proving that problem (3) is a lower bound on (1). Thus, for any , we define which confirms the inequality, and also, -optimality. ∎

For classical IPMs, $\epsilon$-optimality is proved by showing that the optimal solution to (3) satisfies the KKT conditions [Boyd and Vandenberghe, 2004]. However, in our data-driven setting, the feasible set is not explicitly defined by differentiable constraints and the standard approach (i.e., using the KKT conditions) cannot be applied. This necessitates the introduction of the additional parameter. Overall, the extra term results in a less “elegant” result, in exchange for generality. Fortunately, in practice, the extra parameter does not significantly affect implementation, as we describe below.
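The role of the discriminator as a barrier can be illustrated with a toy calculation. The sketch below is built entirely on assumptions that are not taken from the paper: a feasible distribution that is uniform on [0, 1], a fixed Gaussian generator, the objective f(x) = x, and a logarithmic barrier of the form f(x) - lam * log D*(x) with the additional constant term omitted. It only shows the qualitative behaviour: the minimizer is always feasible, and it approaches the true optimum when the regularization parameter is small.

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def p_feasible(x):
    # Hypothetical feasible distribution: uniform on X = [0, 1].
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def optimal_discriminator(x):
    # D*_G(x) = p_X(x) / (p_X(x) + p_G(x)) for a fixed Gaussian generator.
    p_x, p_g = p_feasible(x), gaussian_pdf(x, 0.0, 0.5)
    return p_x / (p_x + p_g)

def barrier_objective(x, lam):
    # Assumed form: f(x) - lam * log D*(x) with f(x) = x; infinite where D* = 0.
    d = optimal_discriminator(x)
    return float("inf") if d == 0.0 else x - lam * math.log(d)

def solve(lam):
    # Brute-force search over a grid that extends beyond the feasible set.
    grid = [(i - 1000) / 1000.0 for i in range(3001)]  # covers [-1, 2]
    return min(grid, key=lambda x: barrier_objective(x, lam))

print(solve(0.1))   # small lam: recovers the constrained optimum x* = 0
print(solve(10.0))  # large lam: still feasible, but pulled away from x*
```

Both solutions lie inside [0, 1] because the barrier is infinite wherever the optimal discriminator is zero. With the large value of `lam` the barrier dominates the objective, which is why the regularization parameter has to be chosen with care.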

Theorem 3.1 proves that the barrier function is capable of guaranteeing $\epsilon$-optimality for any generator. That is, if the discriminator is trained to optimality, an explicit performance bound can be specified. In this context, the generator constructs an approximation of the feasible set and its purpose is analogous to the problem of generating a starting solution in an IPM. This approach sidesteps many of the difficulties of training a GAN to global optimality, as in practice, we need only train to an optimal discriminator and a sufficiently capable generator. Nevertheless, $\epsilon$-optimality is guaranteed only for specific choices of the two constants, i.e., those that satisfy the requirements in the proof. Training a generator to optimality relaxes these necessary conditions.

Let $(G^*, D^*)$ be the globally optimal generator/discriminator pair for a GAN trained on the feasible distribution. For any admissible values of the two constants, (3) and (1) have the same optimal solutions.

Proof.

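From Theorem 3.1, we know that both problems are bounded and feasible. Using Lemma 2.2, the optimal discriminator equals $1/2$ at every feasible point, so the barrier term in the objective function of (3) no longer depends on $x$.

For any fixed value of the regularization parameter, the latter term is just a constant. ∎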

Several prior results (e.g., Li et al., 2017, Arjovsky and Bottou, 2017) demonstrate that, except under very specific conditions, an optimal generator/discriminator pair is generally not achievable. Thus, in practice, we must rely on the results from Theorem 3.1. This introduces two issues that must be addressed to ensure $\epsilon$-optimality is preserved. First, the regularization parameter must be carefully chosen. Second, for the additional parameter to be correctly specified, knowledge of the optimal solutions of (1) and (3) is required.

To overcome these obstacles, note that training is an iterative procedure where the regularization parameter and the additional parameter can be viewed as hyperparameters. That is, both are chosen first and then held fixed while the GAN is trained. After training concludes, both are updated and the GAN is trained again. This iterative procedure continues until a sufficiently small objective function value is obtained. However, notice that during the training phase, the additional term is constant. Thus, it can be removed from the loss function without loss of generality (i.e., set to zero). Consequently, our iterative procedure (omitting the additional parameter) finds the optimal discriminator and the regularization parameter that minimizes (3). Then, to guarantee $\epsilon$-optimality, we choose the regularization parameter in terms of a sufficiently small constant.

3.2 Generating the optimal set

Given a GAN trained on the feasible distribution, the discriminator is a barrier function and the generator produces solutions from a distribution whose support is approximately the feasible set $\mathcal{X}$. We now demonstrate how the generator can be used to learn the optimal set $\mathcal{X}^*$. Consider a GAN trained on samples from the feasible distribution until we reach an optimal discriminator. Suppose that we then freeze the discriminator weights and train the generator over the loss function:

(4)

There exists an optimal generating distribution whose support is the optimal set $\mathcal{X}^*$.

Proof.

For a generator with sufficient capacity, problem (4) is equivalent to minimizing the expected value of the same loss directly over generator output distributions.

For any optimal solution $x^* \in \mathcal{X}^*$, there exists an optimal generating distribution with mass only on $x^*$. This is observed by noting that $x^*$ is a global optimum and that the minimum of a set of values (in this case, objective function values) is smaller than or equal to the mean. Consider any distribution whose support is $\mathcal{X}^*$. By the same argument, the mean of the set equals the minimum because each point in $\mathcal{X}^*$ attains the minimum. ∎
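The mean-versus-minimum argument in the proof can be made concrete with a small discrete example. The sketch below assumes a finite candidate set with hypothetical objective values, two of which are optimal. It compares the expected objective under a distribution spread over the optimal points with the expected objective under a uniform distribution.

```python
def expected_objective(values, probs):
    """Expected objective value under a discrete generating distribution."""
    return sum(v * p for v, p in zip(values, probs))

def uniform_over_optima(values):
    """Distribution that puts equal mass on every minimizer and none elsewhere."""
    best = min(values)
    count = sum(1 for v in values if v == best)
    return [1.0 / count if v == best else 0.0 for v in values]

values = [3.0, 1.0, 2.0, 1.0]          # hypothetical objective values; two optima
on_optima = uniform_over_optima(values)
uniform = [0.25, 0.25, 0.25, 0.25]

print(expected_objective(values, on_optima))  # equals the minimum value
print(expected_objective(values, uniform))    # strictly larger than the minimum
```

Any distribution supported only on the two optimal points attains the same expected value, so the generator is free to spread its mass over the whole optimal set rather than collapsing onto one point.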

Classical methods for solving constrained optimization problems find a single optimal solution, although many may exist. Determining the complete optimal set is typically a difficult problem that relies on enumerative or intelligent search techniques (e.g., Cornuejols and Trick, 1998, Tantawy, 2007, Guenther et al., 2014). However, by leveraging a GAN, we can use the generator to learn the distribution of the optimal set. Note that our approach provides no guarantee that the complete optimal set can be learned. However, any generating distribution supported on a subset of the optimal set, by definition, is also an optimal solution of (2). Further, our numerical experiments suggest that we often converge to the full optimal set or a sufficiently large subset.

Data set , dual parameter , growth rate , counter
procedure Stage 1: boundary and initial feasible set
     Train GAN with to a good generator and optimal discriminator .
     return and .
end procedure
procedure Stage 2: optimal set
     while  has not converged do
         
         
         
     end while
     return .
end procedure
Algorithm 1 IPMAN
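The second stage of Algorithm 1 can be sketched on a one-dimensional toy problem. The sketch below is a reading of the procedure under several assumptions that are not taken from the paper: stage 1 is replaced by a hand-written smooth stand-in for a trained discriminator on the feasible set [0, 1], the generator is the affine map G(z) = a + b z, the objective is f(x) = (x - 2)^2, the barrier is logarithmic, and the dual parameter is updated geometrically after each round of generator training. All names and default values are hypothetical.

```python
import math

K = 10.0  # sharpness of the stand-in discriminator (hypothetical)

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def grad_log_discriminator(x):
    # d/dx log D(x) for the stand-in D(x) = sigmoid(K x) * sigmoid(K (1 - x)),
    # which is large inside [0, 1] and decays quickly outside.
    return K * (1.0 - sigmoid(K * x)) - K * (1.0 - sigmoid(K * (1.0 - x)))

def grad_objective(x):
    # Objective f(x) = (x - 2)^2; its constrained minimizer on [0, 1] is x = 1.
    return 2.0 * (x - 2.0)

def stage_two(a=0.5, b=0.3, lam=5.0, shrink=0.7, lam_min=0.5, inner_steps=2000):
    latents = [i / 10.0 - 1.0 for i in range(21)]  # fixed latent samples in [-1, 1]
    while lam > lam_min:
        lr = 1.0 / (2.0 * (2.0 + lam * K * K / 2.0))  # conservative step size
        for _ in range(inner_steps):
            g_a = g_b = 0.0
            for z in latents:
                x = a + b * z  # generator output G(z) = a + b z
                g = grad_objective(x) - lam * grad_log_discriminator(x)
                g_a += g / len(latents)
                g_b += g * z / len(latents)
            a, b = a - lr * g_a, b - lr * g_b  # gradient step on the generator
        lam *= shrink  # geometric update of the dual parameter
    return a, b, lam

a, b, lam = stage_two()
print(a, b, lam)
```

With the discriminator frozen, the generator output concentrates just inside the boundary near x = 1. Because the stand-in discriminator is smooth rather than exactly zero outside the feasible set, the dual parameter is not driven all the way to zero here; a floor `lam_min` keeps the barrier strong enough to preserve feasibility.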

4 Numerical results

In this section, we demonstrate how to apply our methodology to compute the optimal solution set for two sets of examples. First, to visualize how the IPMAN algorithm performs, we solve a synthetic two-dimensional optimization problem with a non-convex feasible set using several linear and nonlinear objective functions (see Section 4.1). Then, in Section 4.2, we explore how the IPMAN algorithm can be applied to a realistic non-convex optimization problem associated with radiation therapy treatment optimization. Code for all experiments is provided at https://github.com/rafidrm/ipman.

4.1 Synthetic examples

Figure 1: Output of IPMAN over a non-convex feasible set for four objectives: (a) Linear, (b) Quadratic, (c) Bilinear, (d) Rosenbrock. The beige dots represent feasible solutions. The blue dots represent realizations from the final generator distribution, i.e., the optimal set. To improve visibility, we removed outliers beyond a cutoff percentile.

We trained a basic GAN with one hidden layer and leaky ReLU for both generator and discriminator networks [Greydanus, 2017] to learn an L-shaped feasible set $\mathcal{X}$. The shape of $\mathcal{X}$ was chosen because it was non-convex, easy to visualize, and optimal solutions could be verified analytically using the KKT conditions.

To obtain an optimal discriminator, we applied several modifications to the training procedure [Chintala et al., 2016]. First, to generate the training data, we sampled uniformly within a slightly smaller subset of the feasible set and added Gaussian noise to smooth the distribution at the boundary; this helped stabilize training. Second, we used a dataset of i.i.d. samples of infeasible solutions (i.e., points outside the feasible set). In later iterations of stage 1, we periodically replaced generator samples with the infeasible samples in order to better update the discriminator. Finally, we updated the discriminator ten times more frequently than the generator. All models were trained using the Adam optimizer [Kingma and Ba, 2014].
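The data preparation described above can be sketched as follows. The coordinates of the L-shaped set, the shrink margin, the noise level, and the bounding box for infeasible samples are all hypothetical placeholders, since the original values are not reproduced here. The sketch uses simple rejection sampling.

```python
import random

def in_l_shape(x, y, margin=0.0):
    # Hypothetical L-shaped set: [0, 2] x [0, 1] union [0, 1] x [0, 2],
    # shrunk inward by `margin` on every side.
    lo, mid, hi = margin, 1.0 - margin, 2.0 - margin
    return (lo <= x <= hi and lo <= y <= mid) or (lo <= x <= mid and lo <= y <= hi)

def sample_feasible(n, margin=0.05, noise_std=0.01, rng=None):
    rng = rng or random.Random(0)
    points = []
    while len(points) < n:
        x, y = rng.uniform(0.0, 2.0), rng.uniform(0.0, 2.0)
        if in_l_shape(x, y, margin):  # keep points inside the shrunken set
            # Gaussian noise smooths the distribution near the boundary.
            points.append((x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std)))
    return points

def sample_infeasible(n, rng=None):
    rng = rng or random.Random(1)
    points = []
    while len(points) < n:
        x, y = rng.uniform(-1.0, 3.0), rng.uniform(-1.0, 3.0)
        if not in_l_shape(x, y):  # keep only points outside the feasible set
            points.append((x, y))
    return points

feasible = sample_feasible(200)
infeasible = sample_infeasible(200)
```

The infeasible samples can then be mixed into the discriminator's batch of negative examples in later training iterations, in place of generator samples.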

After training the discriminator, we minimized several linear and nonlinear objective functions over this non-convex feasible region by learning the optimal solution set. We used one pair of hyperparameter values for the first three problems and a different pair for the last. Finally, we generated samples for each problem and removed outliers beyond a cutoff percentile.

We evaluated each model based on three different measures. We first considered the mean absolute objective function value error $|f(x) - f(x^*)|$ over the generated samples $x$, where $x^*$ is the known optimal solution. This error is the empirical analogue of the $\epsilon$-optimality guarantee. We also measured the Value-at-Risk (VaR) of this error at a fixed percentile; whereas the first measure calculates the mean, VaR measures the worst generated error. Finally, because the functions grow at different rates, we also calculated the average distance to the optimal set. The final generator distributions are displayed in Figure 1. We present the scores in Table 1 and summarize the results below:

  • Linear : Due to the smoothness of , the discriminator penalizes solutions where . As a result, the final distribution is cut off near the boundary. Nonetheless, the distribution converges to within with a few outliers exceeding .

  • Quadratic : The generator distribution quickly converged to the optimal solution, as the optimal set is a singleton in the feasible set.

  • Bilinear : The objective is non-convex with two optimal solutions at opposite ends of the feasible set. Although the mean objective error is disproportionately high due to quadratic growth from the optimal solution value, the value of the average distance suggests that the generated samples are very close to the optimal solutions.

  • Rosenbrock : This is a standard test for non-convex optimization algorithms [Yang, 2010]. The function has a large easy-to-learn valley (there are many local minima) with a hard-to-find global minimum at . We quickly find the valley and slowly converge to the optimal solution.
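The three evaluation measures used above can be computed as in the following sketch. It assumes samples and optimal points are given as coordinate tuples, that all optimal points share the same objective value, and that VaR is taken as an empirical percentile of the absolute error. The default percentile of 95 is a placeholder, not the value used in the experiments.

```python
import math

def evaluate(samples, objective, optimal_points, percentile=95.0):
    """Return (mean absolute objective error, VaR of the error, mean distance to optima)."""
    best = objective(optimal_points[0])
    errors = sorted(abs(objective(x) - best) for x in samples)
    mean_error = sum(errors) / len(errors)
    # VaR: smallest error that at least `percentile` percent of samples do not exceed.
    rank = max(1, math.ceil(percentile / 100.0 * len(errors)))
    var = errors[rank - 1]
    # Distance from each sample to its nearest optimal point, averaged over samples.
    mean_dist = sum(min(math.dist(x, o) for o in optimal_points) for x in samples) / len(samples)
    return mean_error, var, mean_dist

samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(evaluate(samples, lambda p: p[0] + p[1], [(0.0, 0.0)], percentile=75.0))
```

Reporting the distance alongside the objective error is useful when objectives grow at different rates, since a steep objective can produce a large error even for samples that lie very close to an optimum.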

Objective Optimal solutions VaR
Linear
Quadratic
Bilinear
Rosenbrock
Table 1: IPMAN performance over four synthetic examples.

In all cases, we observe that our method converges to the optimal solution, and when there are multiple global optima, our approach quickly produces many solutions in the optimal set. These examples demonstrate that the IPMAN algorithm is not only theoretically viable, but performs well empirically.

4.2 Radiation therapy treatment planning

In this section, we apply our methodology to the problem of generating RT treatment plans for prostate cancer. Given the computed tomography (CT) images and treatment specifications for a patient [Breedveld and Heijmen, 2017], we used the IPMAN algorithm to generate an optimal dose distribution. The objective penalizes excess dose to healthy tissue and insufficient dose to the target structure. Formally, the optimization problem is

(5a)
(5b)

where the two parameters are the excess dose penalty and the prescribed dose to the target. Constraints (1) and (2) in Table 2 are non-convex Value-at-Risk constraints, whereas (3) and (4) can be modeled as linear constraints.
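The difference between the two constraint types can be illustrated with a small check. The sketch below assumes that a Value-at-Risk constraint on tumor dose takes the form "at least a given fraction of voxels receive at least a given dose", and that a maximum-dose constraint bounds the largest voxel dose. The function names, fractions, and dose values are hypothetical.

```python
def satisfies_var_constraint(doses, fraction, threshold):
    """True if at least `fraction` of the voxels receive a dose of at least `threshold`."""
    covered = sum(1 for d in doses if d >= threshold)
    return covered >= fraction * len(doses)

def satisfies_max_constraint(doses, limit):
    """True if no voxel receives more than `limit`."""
    return max(doses) <= limit

# Two hypothetical plans that each cover half of the voxels with the full dose.
plan_a = [10.0, 0.0]
plan_b = [0.0, 10.0]
average = [(u + v) / 2.0 for u, v in zip(plan_a, plan_b)]

print(satisfies_var_constraint(plan_a, 0.5, 10.0))   # True
print(satisfies_var_constraint(plan_b, 0.5, 10.0))   # True
print(satisfies_var_constraint(average, 0.5, 10.0))  # False
```

The average of two plans that satisfy the coverage constraint can violate it, which shows that the set of plans satisfying such a constraint is not convex. The maximum-dose constraint does not have this problem: if both plans respect the limit, so does their average.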

We used two methods to generate the training data. To generate feasible solutions, we solved convex approximations of problem (5). To better train the discriminator in later iterations, we also generated infeasible solutions. Together, these constituted the training samples used in stage 1 of IPMAN.

We implemented a 3D variant of the Style Transfer GAN [Isola et al., 2017] by modifying the U-net architecture for voxels [Wu et al., 2016]. The network learns a mapping from 3D CT images to 3D dose distributions. In stage 1, we trained for iterations using the same optimizer settings as in Isola et al. [2017]. In later iterations, we periodically replaced generator samples with samples from the infeasible dataset for the discriminator. In stage 2, we froze the discriminator and re-trained the generator for the objective function in (5) plus a barrier. We set , and trained for iterations. As we optimized for the same patient, we used the same CT images with added noise.

Constraint Description DELIVERABLE STAGE-ONE IPMAN
(1) VaR of dose to tumor
(2) VaR of dose to tumor
(3) Max of dose to urethra
(4) Max of dose to bladder
Average objective function value 1.00 3.53 0.12
Table 2: Clinical criteria to evaluate plan acceptability that we used as hard constraints in the treatment optimization model. Note that is defined as the target dose prescribed to the tumor.

In stage 1, we focused on ensuring that the GAN learned the non-convex VaR constraints at the cost of failing the convex constraints to the urethra and bladder. Using the objective function to penalize non-target coverage, we guided the GAN in stage 2 to correct the infeasibility in the urethra and bladder constraints while using the barrier function to ensure that the non-convex VaR constraints remained satisfied. After the second stage, we generated ten predictions (IPMAN) and compared them to all of the deliverable plans (DELIVERABLE) and the sample predictions after stage 1 only (STAGE-ONE). The results are presented in Table 2. Observe that the final generated dose distributions from the IPMAN algorithm satisfy all of the convex and non-convex criteria. Moreover, the objective function values significantly improve over the deliverable plans.

One drawback to the IPMAN approach is that areas of low dose, i.e., structures far away from the target, are smoothed (see Figure 2). This is because the IPMAN algorithm must learn to navigate a complex feasible region to obtain a globally optimal solution, i.e., the precise dose distribution to the target structure. This precision comes at a cost; information on dose delivery far from the target structure is lost. Nevertheless, the high dose region where IPMAN performs well is far more important to predict accurately, as this represents the intervention that treats the cancer.

Figure 2: A sample of CT slices from the test patient. From top to bottom: contoured CT image (generator input), DELIVERABLE plan, STAGE-ONE prediction, and the IPMAN prediction.

5 Conclusion

We present a new methodology for solving constrained optimization problems that combines generative adversarial learning and interior point methods. Our approach extends previous IPM results to situations where the feasible set is non-convex or not even explicitly defined. Our proposed IPMAN algorithm achieves $\epsilon$-optimality with respect to the original problem. Moreover, when there are multiple global optima, our algorithm learns a distribution over the optimal set. We demonstrate the effectiveness of our framework by applying it to a problem in radiation therapy treatment optimization where the feasible set is non-convex.

This work represents the first attempt at using generative adversarial networks to learn optimality criteria for constrained optimization problems. While there are many advantages over classical methods (e.g., data-driven, addresses non-convexity) there are several new challenges. First, our methods bypass the well-known weaknesses of using GANs by requiring only that the discriminator be optimal for any generator. While this is achievable in theory, obtaining a highly accurate discriminator requires significant fine tuning. One method we found successful was to augment training with infeasible points. Second, the contribution of the barrier function to the objective must be carefully managed by determining a satisfactory value for the dual parameter. Nevertheless, the positive results on a difficult optimization problem (i.e., RT treatment optimization) suggest that generatively learning optimality is a viable approach that warrants further investigation.

Acknowledgments

Support for this research was provided by the Natural Sciences and Engineering Research Council of Canada.

References