A General Large Neighborhood Search Framework for Solving Integer Programs

03/29/2020 ∙ by Jialin Song, et al. ∙ 4

This paper studies how to design abstractions of large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general purpose ways, and that are amenable to data-driven design. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. The appeal of LNS is that it can easily use any existing solver as a subroutine, and thus can inherit the benefits of carefully engineered heuristic approaches and their software implementations. We also show that one can learn a good neighborhood selector from training data. Through an extensive empirical validation, we demonstrate that our LNS framework can significantly outperform state-of-the-art commercial solvers such as Gurobi in wall-clock time.


1 Introduction

The design of algorithms for solving hard combinatorial optimization problems remains a valuable and challenging task. Practically relevant problems are typically NP-complete or NP-hard. Examples include any kind of search problem through a combinatorial space, such as inference in graphical models (Wainwright et al., 2005), planning (Ono and Williams, 2008), mechanism design (De Vries and Vohra, 2003), program synthesis (Manna and Waldinger, 1971), verification (Bérard et al., 2013), and engineering design (Cui et al., 2006, Mirhoseini et al., 2017), amongst many others.

The widespread importance of solving these hard combinatorial optimization problems has spurred intense research in designing approximation algorithms and heuristics for large classes of combinatorial optimization settings, such as integer programming (Berthold, 2006, Fischetti and Lodi, 2010, Land and Doig, 2010) and satisfiability (Zhang and Malik, 2002, De Moura and Bjørner, 2008, Dilkina et al., 2009a). Historically, the design of such algorithms was done largely manually, requiring a careful understanding of the underlying structure within specific classes of optimization problems. Such approaches are often unappealing due to the need to obtain substantial domain knowledge, and one often desires a more automated approach.

In recent years, there has been increasing interest in automatically learning good (parameters of) algorithms from training data. The most popular paradigm, also referred to as “learning to search”, aims to learn good local decisions within a search procedure such as branch-and-bound (He et al., 2014, Khalil et al., 2016, 2017a, Song et al., 2018, 2019, Gasse et al., 2019). While this line of research has shown promise, it falls short of delivering practical impact, especially in improving wall-clock time. A major reason is that most algorithms are implemented on open-source solvers such as SCIP, which, according to recent benchmark results (Mittelmann, 2017, Optimization, 2019), is considerably slower than leading commercial solvers such as Gurobi and CPlex (usually by a factor of 10 or more). Such learning to search approaches also ignore the many other heuristics typically employed by commercial solvers, such as primal pre-solve heuristics (Achterberg et al., 2019).

Motivated by the aforementioned drawback, in this paper, we study how to design abstractions of large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers as a generic black-box subroutine. Our goal is to arrive at new approaches that can reliably outperform leading commercial solvers in wall-clock time. We are further interested in designing frameworks that are amenable to data-driven methods. We ground our work in two ways. First, we study how to solve integer programs (IPs), which are a common way to represent many combinatorial optimization problems. Second, we leverage the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. A major appeal of LNS is that it can easily use any existing solver as a subroutine. We are furthermore interested in designing a framework that does not require incorporating extensive domain knowledge in order to apply to various problem domains, e.g., by learning data-driven decision procedures for the framework.

Our contributions can be summarized as:

  • We propose a general LNS framework for solving large-scale IPs. Our framework does not depend on incorporating domain knowledge in order to achieve strong performance. In our experiments, we combine our framework with Gurobi, which is a leading commercial IP solver.

  • We show that, perhaps surprisingly, even using a random decision procedure within our LNS framework significantly outperforms Gurobi on many problem instances.

  • We develop a learning-based approach that predicts a partitioning of the variables of an IP, which then serves as a learned decision procedure within our LNS framework. In a sense, this data-driven procedure is effectively learning how to decompose the original optimization problem into a series of smaller sub-problems that can be solved much more efficiently using existing solvers.

  • We perform an extensive empirical validation across several IP benchmarks, and demonstrate superior wall-clock performance compared to Gurobi across all benchmarks. These results suggest that our LNS framework can effectively leverage leading state-of-the-art solvers to reliably achieve substantial speed-ups in wall-clock time.

2 Related Work on Learning to Optimize

An increasingly popular paradigm for the automated design and tuning of solvers is to use data-driven or learning-based approaches. Broadly speaking, one can categorize most existing “learning to optimize” approaches into three categories: (1) learning search heuristics such as for branch-and-bound; (2) tuning the hyperparameters of existing algorithms; and (3) learning to identify key substructures that an existing solver can exploit, such as backdoor variables. In this section, we survey these three paradigms.

2.1 Learning to Search

In learning to search, one typically operates within the framework of a search heuristic, and trains a local decision policy from training data. Perhaps the most popular search framework for integer programs is branch-and-bound (Land and Doig, 2010), which is a complete algorithm for solving integer programs (IPs) to optimality. Branch-and-bound is a general framework that includes many decision points that guide the search process, which historically have been designed using carefully attained domain knowledge. To arrive at more automated approaches, a collection of recent works explore learning data-driven models to outperform manually designed heuristics, including learning for branching variable selection (Khalil et al., 2016, Gasse et al., 2019), or node selection (He et al., 2014, Song et al., 2018, 2019). Moreover, one can also train a model to decide when to run primal heuristics endowed in many IP solvers (Khalil et al., 2017a). Many of these approaches are trained as policies using reinforcement or imitation learning.

Writing highly optimized software implementations is challenging, and so all previous work on learning to branch-and-bound was implemented within existing software frameworks that admit interfaces for custom functions. The most common choice is the open-source solver SCIP (Achterberg, 2009), while some previous work relied on callback methods with CPlex (Bliek et al., 2014, Khalil et al., 2016). However, in general, one cannot depend on highly optimized solvers being amenable to incorporating learned decision procedures as subroutines. For instance, Gurobi, the leading commercial IP solver according to (Mittelmann, 2017, Optimization, 2019), has very limited interface capabilities, and to date, none of the learned branch-and-bound implementations can reliably outperform Gurobi.

Beyond branch-and-bound, other search frameworks that are amenable to data-driven design include A* search (Song et al., 2018), direct forward search (Khalil et al., 2017b), and sampling-based planning (Chen et al., 2020). These settings are less directly relevant, since our work is grounded in solving IPs. However, the LNS framework can, in principle, be interpreted more generally to include these other settings as well, which is an interesting direction for future work.

2.2 Algorithm Configuration

Another area of using learning to speed up optimization solvers is algorithm configuration (Hoos, 2011, Hutter et al., 2011, Ansótegui et al., 2015, Balcan et al., 2018, Kleinberg et al., 2019). Existing solvers tend to have many customizable hyperparameters whose values strongly influence the solver behaviors. Algorithm configuration aims to optimize those parameters on a problem-by-problem basis to speed up the solver.

Similar to our approach, algorithm configuration approaches leverage existing solvers. One key conceptual difference is that algorithm configuration does not yield fundamentally new approaches, but rather is a process for tuning the hyperparameters of an existing approach. As a consequence, one limitation of algorithm configuration approaches is that they rely on the underlying solver being able to solve problem instances in a reasonable amount of time, which may not be possible for hard problem instances. Our LNS framework can thus be viewed as a complementary paradigm for leveraging existing solvers. In fact, in our experiments, we perform a simple version of algorithm configuration. We defer incorporating more complex algorithm configuration procedures as future work.

2.3 Learning to Identify Substructures

The third category of approaches is learning to predict key substructures of an optimization problem. A canonical example is learning to predict backdoor variables (Dilkina et al., 2009b), which are a set of variables that, once instantiated, the remaining problem simplifies to a tractable form (Dilkina et al., 2009a). Our approach bears some high-level affinity to this paradigm, as we effectively aim to learn decompositions of the original problem into a series of smaller subproblems. However, our approach makes a much weaker structural assumption, and thus can more readily leverage a broader suite of existing solvers. Other examples of this general paradigm include learning to pre-condition solvers, such as generating an initial solution to be refined with a downstream solver, which is typically more popular in continuous optimization settings (Kim et al., 2018).

3 A General Large Neighborhood Search Framework for Integer Programs

We now present our large neighborhood search (LNS) framework for solving integer programs (IPs). LNS is a metaheuristic that generalizes the neighborhood search for optimization which iteratively improves an existing solution by local search. As a concept, LNS has been studied for over two decades (Shaw, 1998, Ahuja et al., 2002, Pisinger and Ropke, 2010). However, previous work studied specialized settings with domain-specific decision procedures. For example, in Shaw (1998), the definition of neighborhoods is highly specific to the vehicle routing problem, so the decision making of how to navigate the neighborhood is also domain-specific. We instead aim to develop a general framework that avoids requiring domain-specific structures, and whose decision procedures can be designed in a generic and automated way, e.g., via learning as described in Section 4. In particular, our approach can be viewed as a decomposition-based LNS framework that operates on generic IP representations, as described in Section 3.2.

3.1 Background

Formally, let $\mathcal{X}$ be the set of all variables in an optimization problem and $\mathcal{S}$ be all possible value assignments of $\mathcal{X}$. For a current solution $s \in \mathcal{S}$, a neighborhood function $N(s) \subset \mathcal{S}$ is a collection of candidate solutions to replace $s$; afterwards, a solver subroutine is invoked to find the optimal solution within $N(s)$. Traditional neighborhood search approaches define $N(s)$ explicitly, e.g., the 2-opt operation in the traveling salesman problem (Dorigo et al., 2006) and its extension, the $k$-opt operation (Helsgaun, 2009). LNS defines $N(s)$ implicitly through a destroy and a repair method. A destroy method destructs part of the current solution while a repair method rebuilds the destroyed solution. The number of candidate repairs is potentially exponential in the size of the neighborhood, which explains the “large” in LNS.

In the context of solving IPs, LNS is also used as a primal heuristic for finding high-quality incumbent solutions (Rothberg, 2007, Helber and Sahling, 2010, Hendel, 2018). In these works, large neighborhoods are constructed randomly (Rothberg, 2007), defined manually (Helber and Sahling, 2010), or selected by a bandit algorithm from a pre-defined set (Hendel, 2018). Furthermore, because of the level of decision-making, these LNS approaches often require interface access to the underlying solver, which is often undesirable when designing frameworks that offer ease of deployment.

Recently, there has been some work on using learning within LNS (Hottung and Tierney, 2019, Syed et al., 2019). These approaches are designed for specific optimization problems, such as capacitated vehicle routing, and so are not directly comparable with our generic approach for solving IPs. Furthermore, they often focus on learning the underlying solver (rather than rely on existing state-of-the-art solvers), which makes them unappealing from a deployment perspective.

3.2 Decomposition-based Large Neighborhood Search for Integer Programs

We now describe the details of our LNS framework. At a high level, our LNS framework operates on an integer program (IP) via defining decompositions of its integer variables into disjoint subsets. Afterwards, we can select a subset and use an existing solver to optimize the variables in that subset while holding all other variables fixed. The benefit of this framework is that it is completely generic to any IP instantiation of any combinatorial optimization problem.

Throughout this paper, we consider minimization of the objective value for all the problems. We first describe a version of LNS for integer programs based on decompositions of their integer variables which is a modified version of the evolutionary approach proposed in Rothberg (2007). The algorithm is outlined in Alg 1.

For an integer program $P$ with a set of integer variables $X$ (not necessarily all the integer variables), we define a decomposition of the set $X$ as a disjoint union $X_1 \cup X_2 \cup \cdots \cup X_k$. Assuming we have an existing feasible solution $S_X$ to $P$, we view each subset $X_i$ of integer variables as a local neighborhood for search. We fix the integer variables in $X \setminus X_i$ to their values in the current solution $S_X$ and optimize over the variables in $X_i$ (referred to as the FIXANDOPTIMIZE function in Line 3 of Alg 1). As the resulting optimization is a smaller IP, we can use any off-the-shelf IP solver to carry out the local search. In our experiments, we use Gurobi to optimize the sub-IP. A new solution is obtained and we repeat the process with the remaining subsets.

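1:  Input: an optimization problem $P$, an initial solution $S_X$, a decomposition $X = X_1 \cup X_2 \cup \cdots \cup X_k$, a solver $F$
2:  for $i = 1, \ldots, k$ do
3:     $S_X \leftarrow \text{FIXANDOPTIMIZE}(P, S_X, X_i, F)$
4:  end for
5:  return $S_X$
Algorithm 1 Decomposition-based LNS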
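The loop in Alg 1 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `fix_and_optimize` and `decomposition_lns` are hypothetical, variables are assumed binary, and a brute-force enumeration stands in for the off-the-shelf IP solver (Gurobi in the paper), so it is only practical for tiny subsets.

```python
from itertools import product


def fix_and_optimize(objective, is_feasible, solution, subset):
    """Brute-force stand-in for a sub-IP solver (assumption, not Gurobi):
    re-optimize the binary variables in `subset` while every other
    variable keeps its value from the current solution."""
    best, best_val = dict(solution), objective(solution)
    for values in product((0, 1), repeat=len(subset)):
        candidate = dict(solution)
        candidate.update(zip(subset, values))
        if is_feasible(candidate) and objective(candidate) < best_val:
            best, best_val = candidate, objective(candidate)
    return best


def decomposition_lns(objective, is_feasible, solution, decomposition):
    """One pass of decomposition-based LNS: visit each subset in turn."""
    for subset in decomposition:
        solution = fix_and_optimize(objective, is_feasible, solution, subset)
    return solution


# Toy minimum vertex cover on the path 0-1-2-3 with unit weights.
edges = [(0, 1), (1, 2), (2, 3)]
objective = lambda s: sum(s.values())
is_feasible = lambda s: all(s[u] or s[v] for u, v in edges)
start = {v: 1 for v in range(4)}  # trivial cover: take every vertex
result = decomposition_lns(objective, is_feasible, start, [[0, 1], [2, 3]])
```

On this toy instance the cover shrinks from four vertices to two. In a real setting the enumeration would be replaced by a call to an IP solver with the variables outside the subset fixed.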

Decomposition Decision Procedures. Notice that each decomposition defines a different series of LNS sub-problems, so the effectiveness of our approach depends on the decomposition chosen at each iteration. The simplest implementation is to use a random decomposition approach, which we show empirically already delivers very strong performance. We can also consider learning-based approaches that learn a decomposition from training data, discussed further in Section 4.
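A random decomposition into equally sized subsets could be generated along the following lines. The function name `random_decomposition` and the shuffle-then-stride scheme are illustrative assumptions; the paper does not specify how the random partition is drawn.

```python
import random


def random_decomposition(variables, k, rng=None):
    """Shuffle the variables and deal them into k disjoint subsets whose
    sizes differ by at most one (hypothetical helper)."""
    rng = rng or random.Random()
    shuffled = list(variables)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


parts = random_decomposition(range(10), 3, random.Random(0))
```

Passing an explicit `random.Random` instance keeps experiments reproducible across seeds.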

4 Learning a Decomposition

In this study, we apply data-driven methods, such as reinforcement learning and imitation learning, to learn policies that generate decompositions for the LNS framework described in Section 3.2. We specialize a Markov decision process for our setting. For a combinatorial optimization problem instance $P$ with a set of integer variables $X$, a state $s \in \mathcal{S}$ is a vector representing an assignment for the variables in $X$, i.e., it is an incumbent solution. An action $a \in \mathcal{A}$ at a state $s$ is a decomposition of $X$ as described in Section 3.2. After running LNS through the neighborhoods defined in $a$, we obtain a (new) solution $s'$. The reward is $r(s, a) = J(s) - J(s')$, where $J(s)$ is the objective value of $P$ when $s$ is the solution. We restrict ourselves to a finite-horizon task of length $T$, so we can set the discount factor $\gamma$ to be 1.
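A short worked identity may help explain why an undiscounted, finite-horizon formulation is natural here. Writing $J(s)$ for the objective value at solution $s$ and assuming the reward is the per-step improvement $J(s_t) - J(s_{t+1})$ (consistent with minimization), the return over a horizon of length $T$ telescopes:

```latex
\sum_{t=0}^{T-1} r(s_t, a_t)
  = \sum_{t=0}^{T-1} \bigl( J(s_t) - J(s_{t+1}) \bigr)
  = J(s_0) - J(s_T)
```

Under this assumption, maximizing the cumulative reward is the same as minimizing the objective value of the final solution $s_T$.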

4.1 Reinforcement Learning

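For reinforcement learning, for simplicity, we choose to use REINFORCE (Sutton et al., 2000), which is a classical Monte-Carlo policy gradient method for optimizing policies. The goal is to find a policy $\pi$ that maximizes $\eta(\pi) = \mathbb{E}_{\pi}\left[\sum_{t=0}^{T-1} \gamma^{t} r(s_t, a_t)\right]$, the expected discounted cumulative reward. The policy is normally parameterized with some $\theta$. Policy gradient methods seek to optimize $\eta(\pi_\theta)$ by updating $\theta$ in the direction of

$$\nabla_\theta \eta(\pi_\theta) = \mathbb{E}_{\pi_\theta}\left[\sum_{t=0}^{T-1} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \sum_{t'=t}^{T-1} \gamma^{t'} r(s_{t'}, a_{t'})\right].$$

By sampling trajectories $(s_0, a_0, \ldots, s_{T-1}, a_{T-1})$, one can estimate the gradient $\nabla_\theta \eta(\pi_\theta)$.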
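The Monte-Carlo gradient estimate can be sketched in plain Python. This is a toy illustration, not the policy used in the paper: it assumes a state-independent softmax policy over a small discrete action set, parameterized directly by its logits, and uses the undiscounted reward-to-go (discount factor 1). All function names are hypothetical.

```python
import math


def rewards_to_go(rewards):
    """Undiscounted sum of rewards from each step to the end of the trajectory."""
    out, running = [0.0] * len(rewards), 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        out[t] = running
    return out


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_softmax(logits, action):
    """Gradient of log pi(action) with respect to the logits."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def reinforce_gradient(trajectories, logits):
    """Average the score-function estimate over sampled trajectories.
    Each trajectory is a pair (actions, rewards) of equal length."""
    grad = [0.0] * len(logits)
    for actions, rewards in trajectories:
        for action, ret in zip(actions, rewards_to_go(rewards)):
            score = grad_log_softmax(logits, action)
            for i in range(len(grad)):
                grad[i] += score[i] * ret
    return [g / len(trajectories) for g in grad]


estimate = reinforce_gradient([([0], [2.0])], [0.0, 0.0])
```

A gradient ascent step would then add a small multiple of `estimate` to the logits. A practical policy would condition on the state (the incumbent solution) and typically subtract a baseline to reduce variance.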

4.2 Imitation Learning

1:  Input: a collection of optimization problems $\{P_i\}_{i=1}^{n}$ with initial solutions $\{S_{X_i}\}_{i=1}^{n}$, $T$ the time horizon, $m$ the number of random decompositions to sample, $k$ the number of subsets in a decomposition, a solver $F$.
2:  for $i = 1, \ldots, n$ do
3:     
4:     
5:     for $t = 1, \ldots, T$ do
6:         
7:         for $j = 1, \ldots, m$ do
8:            
9:            
10:            Decomposition-based LNS
11:            
12:         end for
13:         if  then
14:            
15:            
16:         end if
17:     end for
18:     Record for $P_i$
19:  end for
20:  return  
Algorithm 2 COLLECTDEMOS

In imitation learning, demonstrations (from an expert) serve as the learning signals. However, we do not have access to an expert to generate good decompositions. To overcome this issue, we generate demonstrations by sampling random decompositions and taking the ones resulting in the best objectives as demonstrations. This procedure is shown in Alg 2. The core of the algorithm is shown on Lines 7-12, where we repeatedly sample random decompositions and call the Decomposition-based LNS algorithm (Alg 1) to evaluate them. In the end, we record the decompositions with the best objective values (Lines 13-16).
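The demonstration-collection idea described above can be sketched for a single problem instance as follows. This is an approximation written from the prose description, not a transcription of Alg 2: the names `lns_pass` and `collect_demos` are hypothetical, variables are assumed binary, and brute-force enumeration replaces the IP solver.

```python
import random
from itertools import product


def lns_pass(objective, is_feasible, solution, decomposition):
    """Stand-in for Alg 1: brute-force each subset of binary variables in turn."""
    for subset in decomposition:
        best = solution
        for values in product((0, 1), repeat=len(subset)):
            cand = {**solution, **dict(zip(subset, values))}
            if is_feasible(cand) and objective(cand) < objective(best):
                best = cand
        solution = best
    return solution


def collect_demos(objective, is_feasible, solution, horizon, m, k, rng):
    """At each of `horizon` steps, sample m random decompositions, keep the
    one with the best resulting objective, and record it with the state."""
    demos, variables = [], sorted(solution)
    for _ in range(horizon):
        best_obj, best_decomp, best_sol = objective(solution), None, solution
        for _ in range(m):
            shuffled = variables[:]
            rng.shuffle(shuffled)
            decomp = [shuffled[i::k] for i in range(k)]
            cand = lns_pass(objective, is_feasible, solution, decomp)
            if objective(cand) < best_obj:
                best_obj, best_decomp, best_sol = objective(cand), decomp, cand
        if best_decomp is not None:
            demos.append((dict(solution), best_decomp))  # (state, action) pair
            solution = best_sol
    return demos, solution


# Toy minimum vertex cover on a path with 6 vertices.
edges = [(i, i + 1) for i in range(5)]
objective = lambda s: sum(s.values())
is_feasible = lambda s: all(s[u] or s[v] for u, v in edges)
start = {v: 1 for v in range(6)}
demos, final = collect_demos(objective, is_feasible, start,
                             horizon=3, m=5, k=2, rng=random.Random(0))
```

Each recorded pair couples the incumbent solution with the decomposition that improved it most, which is the form of supervision that behavior cloning needs.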

Once we have generated a collection of good decompositions, we apply two imitation learning algorithms. The first one is behavior cloning (Pomerleau, 1989). The main idea is to turn each demonstration trajectory into a collection of state-action pairs $(s_t, a_t)$, then treat policy learning as a supervised learning problem. In our case, the action $a_t$ is a decomposition, which we represent as a vector. Each element of the vector has a label indicating which subset the corresponding variable belongs to. Thus, we reduce the learning problem to a supervised classification task.
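The vector representation of a decomposition can be made concrete with a small sketch. The helper names are hypothetical, and variables are assumed to be indexed from 0.

```python
def decomposition_to_labels(decomposition, num_vars):
    """Encode a decomposition as one subset label per variable."""
    labels = [-1] * num_vars
    for subset_id, subset in enumerate(decomposition):
        for var in subset:
            labels[var] = subset_id
    return labels


def labels_to_decomposition(labels, k):
    """Decode per-variable labels (e.g. classifier outputs) into k subsets."""
    subsets = [[] for _ in range(k)]
    for var, subset_id in enumerate(labels):
        subsets[subset_id].append(var)
    return subsets


labels = decomposition_to_labels([[0, 3], [1, 2]], num_vars=4)
```

With this encoding, a classifier that predicts one of $k$ labels per variable directly yields a decomposition. Note that permuting the subset labels describes the same partition, which a training setup may need to account for.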

Behavior cloning suffers from cascading errors (Ross and Bagnell, 2010). We use the forward training algorithm (Ross and Bagnell, 2010) to correct mistakes made at each step. We adapt the forward training algorithm for our use case and present it in Alg 3, which uses Alg 2 as a subroutine. The main difference from behavior cloning is the adaptive demonstration collection step on Line 4. In this case, we do not collect all demonstrations beforehand; instead, they are collected depending on the decompositions predicted by previous policies.

1:  Input: a collection of optimization problems $\{P_i\}_{i=1}^{n}$ with initial solutions $\{S_{X_i}\}_{i=1}^{n}$, $T$ the time horizon, $m$ the number of random decompositions to sample, $k$ the number of subsets in a decomposition, a solver $F$.
2:  for $t = 1, \ldots, T$ do
3:     
4:     
5:     
6:     for $i = 1, \ldots, n$ do
7:         
8:         
9:     end for
10:  end for
11:  return  
Algorithm 3 Forward Training for LNS

4.3 Featurization of an Optimization Problem

For training models, it is necessary to define features that contain enough information for learning. In the following paragraphs, we describe the featurization of two classes of combinatorial optimization problems.

Combinatorial Optimization over Graphs.

The first class of problems is defined explicitly over graphs, such as those considered in Khalil et al. (2017b). Examples include the minimum vertex cover, the maximum cut and the traveling salesman problems. The (weighted) adjacency matrix of the graph contains all the information to define the optimization problem, so we use it as the feature input to a learning model. Notice that for such optimization problems, each vertex in the graph is often associated with an integer variable in its IP formulation.

General Integer Programs.

There are other classes of combinatorial optimization problems that do not originate from explicit graphs. Nevertheless, they can be modeled as integer programs. We construct the following incidence matrix $A$ between the integer variables and the constraints. For each integer variable $x_i$ and constraint $c_j$, $A[i, j] = \text{coeff}(x_i, c_j)$, where $\text{coeff}(x_i, c_j)$ is the coefficient of the variable $x_i$ in the constraint $c_j$ if it appears in it and 0 otherwise.

Incorporating Current Solution.

As outlined in Section 3.2, we seek to adaptively generate decompositions based on the current solution. Thus we need to include the solution in the featurization. Regardless of which featurization we use, the feature matrix has the same number of rows as the number of integer variables we consider. As a result, we can simply include the variable value in the solution as an additional feature.
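As a small illustration of the featurization for general integer programs, the sketch below builds one feature row per integer variable from the variable-constraint incidence matrix and appends the variable's current value as an extra column. The function name and the sparse dictionary format for constraints are assumptions made for this example.

```python
def incidence_features(num_vars, constraints, solution):
    """Build the feature matrix with one row per integer variable.

    constraints: list of {variable_index: coefficient} dicts, one per constraint
                 (hypothetical sparse format).
    solution:    current value of each variable, appended as the last column.
    """
    rows = []
    for i in range(num_vars):
        row = [float(c.get(i, 0.0)) for c in constraints]  # 0 if x_i is absent
        row.append(float(solution[i]))
        rows.append(row)
    return rows


# Two constraints over three variables: x0 + x1 and 2*x1 - x2.
features = incidence_features(3, [{0: 1.0, 1: 1.0}, {1: 2.0, 2: -1.0}], [1, 0, 1])
```

For graph problems the same pattern applies with rows of the (weighted) adjacency matrix in place of constraint coefficients.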

5 Empirical Validation

We present experimental results on four diverse applications covering both combinatorial optimization over graphs and general IPs. We discuss the design choices of crucial parameters in Section 5.1, and present the main results in Sections 5.2 & 5.3. Finally, we inspect visualizations to interpret predicted decompositions in Section 5.4, and discuss some limitations in Section 5.5.

5.1 Datasets & Setup

Datasets. We evaluate on 4 NP-hard benchmark problems expressed as IPs. The first two, minimum vertex cover (MVC) and maximum cut (MAXCUT), are graph optimization problems. For each problem, we consider two random graph distributions, the Erdős-Rényi (ER) (Erdős and Rényi, 1960) and the Barabási-Albert (BA) (Albert and Barabási, 2002) random graph models. For MVC, we use graphs of size 1000. For MAXCUT, we use graphs of size 500. All the graphs are weighted and each vertex/edge weight is sampled uniformly from [0, 1] for MVC and MAXCUT, respectively. We also apply our method to combinatorial auctions (Leyton-Brown et al., 2000) and risk-aware path planning (Ono and Williams, 2008), which are not based on graphs. We use the Combinatorial Auction Test Suite (CATS) (Leyton-Brown et al., 2000) to generate auction instances from two distributions: regions and arbitrary. For each distribution, we consider two sizes: 2000 items with 4000 bids and 4000 items with 8000 bids. For the risk-aware path planning experiment, we use a custom generator to generate obstacle maps with 30 obstacles and 40 obstacles.

Learning a Decomposition. When learning the decomposition procedure we use 100 instances for training, 10 for validation and 50 for testing. When using reinforcement learning, we sample 5 trajectories for each problem to estimate the policy gradient. For imitation learning based algorithms, we sample 5 random decompositions and use the best one as demonstrations. All our experiment results are averaged over 5 random seeds.

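Table 1: Parameter sweep results for $(k, t)$ on an MVC dataset of Erdős-Rényi random graphs with 1000 vertices. Rows correspond to the number of subsets $k \in \{2, 3, 4, 5\}$ and columns to the sub-IP time limit $t \in \{1, 2, 3\}$ seconds. Numbers represent improvement ratios for one decomposition, averaged over 5 random seeds.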

Initialization. To run large neighborhood search, we require an initial feasible solution (typically quite far from optimal). For MVC, MAXCUT and CATS, we initialize a feasible solution by including all vertices in the cover set, assigning all vertices to one set and accepting no bids, respectively. For risk-aware path planning, we initialize a feasible solution by running Gurobi for 3 seconds. This time is included when we compare wall-clock time with Gurobi.

Hyperparameter Configuration. We must set two parameters in order to run our LNS approach. The first one is $k$, the number of equally sized subsets to divide the variables into. The second is $t$, how long we run the solver on each sub-problem. Each sub-IP could still be fairly large, so solving it to optimality can take a long time; we therefore impose a time limit. We run a parameter sweep over the number of subsets $k$ from 2 to 5 and the time limit $t$ for each sub-IP from 1 second to 3 seconds. For each configuration of $(k, t)$, the wall-clock time for one iteration of LNS will be different. For a fair comparison, we use the ratio $\Delta / \tau$, where $\Delta$ is the objective value improvement and $\tau$ is the time spent, as the selection criterion for the optimal configuration. Table 1 contains the results for one MVC dataset. As shown, the configuration makes a big difference in performance. For this case, we use the best-performing $(k, t)$ setting from Table 1 for our experiments on this particular dataset. We perform the same procedure for every dataset. See the Appendix for similar tables for other datasets.
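The selection criterion amounts to picking the configuration with the largest improvement per unit of wall-clock time. A minimal sketch follows; the function name is hypothetical and the sweep values are made-up placeholders, not results from the paper.

```python
def best_configuration(sweep):
    """Return the (k, time_limit) pair with the highest ratio of objective
    improvement to seconds spent.

    sweep maps (k, time_limit) -> (objective_improvement, seconds_spent).
    """
    return max(sweep, key=lambda cfg: sweep[cfg][0] / sweep[cfg][1])


# Illustrative numbers only (not taken from Table 1).
sweep = {
    (2, 1): (10.0, 2.0),   # ratio 5.0
    (3, 2): (30.0, 6.5),   # ratio about 4.6
    (4, 1): (18.0, 4.0),   # ratio 4.5
}
chosen = best_configuration(sweep)
```

Normalizing by time spent avoids favoring configurations that improve more only because one LNS iteration takes longer.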

5.2 Benchmark Comparisons with Gurobi

We now present our main benchmark evaluations. We instantiate our framework in four ways:

  • Random-LNS: using random decompositions

  • BC-LNS: using a decomposition policy trained using behavior cloning

  • FT-LNS: using a decomposition policy trained using forward training

  • RL-LNS: using a decomposition policy trained using REINFORCE

We use Gurobi 9.0 as the underlying solver. For learned LNS methods, we generate 10 decompositions in sequence with the model and apply LNS with these decompositions. We use the same time limit setting for running each sub-problem; as a result, the wall-clock times of the different decomposition methods are very close.

When comparing against Gurobi alone, we limit Gurobi’s runtime to the longest runtime across all instances from our LNS methods. In other words, Gurobi’s runtime is at least as long as that of every decomposition-based method, which gives it more time to find the best solution possible.

Main Results. Tables 2, 3 and 4 show the results across the different benchmarks. We make two observations:

  • All LNS variants significantly outperform Gurobi in objective value, given the same amount or less wall-clock time. Perhaps surprisingly, this phenomenon holds true even for Random-LNS.

  • The imitation learning based variants, FT-LNS and BC-LNS, outperform Random-LNS and RL-LNS in most cases.

Overall, these results suggest that our LNS approach can reliably offer substantial improvements over state-of-the-art solvers such as Gurobi. These results also suggest that one can use learning to automatically design strong decomposition approaches, and we provide a preliminary qualitative study of what the policy has learned in Section 5.4. It is possible that a more sophisticated RL method could further improve RL-LNS.

(a) Combinatorial auction of 2000 items and 4000 bids from the regions distribution.
(b) Combinatorial auction of 2000 items and 4000 bids from the arbitrary distribution.
(c) Risk-aware path planning for navigating through 30 obstacles.
Figure 1: Improvements of objective values as more iterations of LNS are applied. In all three cases, imitation learning methods, BC-LNS and FT-LNS, outperform the Random-LNS.
(a) Combinatorial auction of 2000 items and 4000 bids from regions distribution.
(b) Combinatorial auction of 2000 items and 4000 bids from arbitrary distribution.
(c) Risk-aware path planning for navigating through 30 obstacles.
(d) Maximum cut over a Barabási-Albert random graph with 500 vertices.
Figure 2: We compare LNS methods on how the objective values improve as more wall-clock time is spent for some representative problem instances. We also include Gurobi in the comparison. All LNS methods find better solutions than Gurobi early on and the advantage is maintained even though Gurobi is given more time for each instance. In Fig 2(d), after running for 2 hours, Gurobi is unable to match the quality of solution found by Random-LNS in 5 seconds.

Per-Iteration Comparison. We use a total of 10 iterations of LNS, and it is natural to ask how the solution quality changes after each iteration. Fig 1 shows objective value progressions of variants of our LNS approach on three datasets. For the two combinatorial auction datasets, BC-LNS and FT-LNS achieve substantial performance gains over Random-LNS after just 2 iterations of LNS, while it takes about 4 for the risk-aware path planning setting. These results show that learning a decomposition method for LNS can establish early advantages over using random decompositions.

Running Time Comparison. Our primary benchmark comparison limited all methods to roughly the same time limit. We now investigate how the objective values improve over time. Figure 2 shows four representative instances. We see that BC-LNS achieves the best performance profile of solution quality vs. wall-clock.

How Long Does Gurobi Need? Figure 2 also allows us to compare with the performance profile of Gurobi. In all cases, LNS methods find better objective values than Gurobi early on and maintain this advantage even as Gurobi spends significantly more time. Most notably, in Figure 2(d), Gurobi was given 2 hours of wall-clock time, and failed to match the solution found by Random-LNS in just under 5 seconds (the time axis is in log scale).

Table 2: Comparison of different LNS methods and Gurobi for MVC and MAXCUT problems with different random graph generators. Columns: MVC BA 1000, MVC ER 1000, MAXCUT BA 500, MAXCUT ER 500. Rows: Gurobi, Random-LNS, BC-LNS, FT-LNS, RL-LNS.

Table 3: Comparison of different LNS methods and Gurobi for CATS problems with different bid distributions. Columns: CATS Regions 2000, CATS Regions 4000, CATS Arbitrary 2000, CATS Arbitrary 4000. Rows: Gurobi, Random-LNS, BC-LNS, FT-LNS.

Table 4: Comparison of learning-based methods with the random method and Gurobi for the risk-aware path planning problems with different numbers of obstacles. Columns: 30 Obstacles, 40 Obstacles. Rows: Gurobi, Random-LNS, BC-LNS, FT-LNS.

5.3 Comparison with Domain-Specific Heuristics

We also compare with strong domain-specific heuristics for three classes of problems: MVC, MAXCUT and CATS. We do not compare in the risk-aware path planning domain, as there are no readily available heuristics. Overall, we find that our LNS methods are competitive with specially designed heuristics, and can sometimes substantially outperform them. These results provide evidence that our LNS approach is a promising direction for the automated design of solvers that avoids the need to carefully integrate domain knowledge while achieving competitive or state-of-the-art performance.

Table 5: Comparison between LNS and the local-ratio heuristic for MVC. Columns: MVC BA 1000, MVC ER 1000. Rows: Local-ratio, Best-LNS.

For MVC, we compare with a 2-OPT heuristic based on local-ratio approximation (Bar-Yehuda and Even, 1983). Table 5 summarizes the results. The best LNS result outperforms the local-ratio heuristic on both BA and ER graphs.

For MAXCUT, we compare with 3 heuristics. The first is the greedy algorithm that iteratively moves vertices from one cut set to the other based on whether such a movement can increase the total edge weights. The second, proposed in Burer et al. (2002), is based on a rank-two relaxation of an SDP. The third is from de Sousa et al. (2013). The results are presented in Table 6. The SDP-based heuristic performs best for both random graph distributions, which shows that a specially designed heuristic can still outperform a general IP solver.

For CATS, we consider 2 heuristics. The first is greedy: at each step, we accept the highest bid among the remaining bids, remove its desired items and eliminate other bids that desire any of the removed items. The second is based on LP rounding: we first solve the LP relaxation of the IP formulation of a combinatorial auction problem, and then we proceed from the bid with the largest fractional value in the LP solution downwards, removing items/bids in the same manner. As shown in Table 7, our LNS approach outperforms both methods in objective values.
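The greedy baseline described above can be sketched as follows. The function name and the `(price, items)` bid format are assumptions for illustration; scanning bids in descending price order and skipping any bid that overlaps already-sold items is equivalent to repeatedly accepting the highest remaining bid and eliminating conflicting ones.

```python
def greedy_auction(bids):
    """Greedy heuristic for a combinatorial auction.

    bids: list of (price, items) pairs, where items is a set of item ids.
    Returns the indices of accepted bids in the order they were accepted.
    """
    taken, accepted = set(), []
    order = sorted(range(len(bids)), key=lambda i: bids[i][0], reverse=True)
    for i in order:
        price, items = bids[i]
        if taken.isdisjoint(items):  # bid does not conflict with sold items
            accepted.append(i)
            taken.update(items)
    return accepted


bids = [(10, {1, 2}), (8, {2, 3}), (5, {3}), (4, {4})]
winners = greedy_auction(bids)
revenue = sum(bids[i][0] for i in winners)
```

The LP rounding variant would use the same acceptance loop, but order the bids by their fractional values in the LP relaxation instead of by price.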

MAXCUT BA 500 MAXCUT ER 500
Greedy
Burer
De Sousa
Best-LNS
Table 6: Comparison of LNS with three heuristics for MAXCUT.

CATS Regions 2000 CATS Regions 4000 CATS Arbitrary 2000 CATS Arbitrary 4000
Greedy
LP Rounding
Best-LNS
Table 7: Comparison of LNS with greedy and LP rounding heuristics for combinatorial auctions.

5.4 Visualization

Figure 3: Visualizing predicted decompositions in a risk-aware path planning problem, with 4 consecutive solutions after 3 iterations of LNS. Panels: (a) Iteration 2, (b) Iteration 3, (c) Iteration 4, (d) Iteration 5. Each blue square is an obstacle and each cross is a waypoint. The obstacles in red and waypoints in dark blue are the most frequent ones in the subsets that lead to high local improvement.

A natural question is what properties a good decomposition has. Here we provide one interpretation for risk-aware path planning. We use a slightly smaller instance with 20 obstacles for a clearer view. Binary variables in an IP formulation of this problem model relationships between obstacles and waypoints. Thus we can interpret the neighborhood formed by a subset of binary variables as attention over specific relationships among some obstacles and waypoints.

Figure 3 captures 4 consecutive iterations of LNS with large solution improvements. Each sub-figure contains information about the locations of obstacles (light blue squares) and the waypoint locations after the current iteration of LNS. We highlight a subset of 5 obstacles (red circles) and 5 waypoints (dark blue squares) that appear most frequently in the first neighborhood of the current decomposition. Qualitatively, the top 5 obstacles define some important junctions for waypoint updates. For waypoint updates, the highlighted ones tend to have large changes between iterations. Thus, a good decomposition focuses on important decision regions and allows for large updates in these regions.
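The highlighting step can be illustrated with a short frequency count. The sketch assumes each binary variable in a neighborhood maps to one (obstacle, waypoint) pair. The mapping `var_to_pair` and the function name are hypothetical, since the text does not spell out the variable indexing.

```python
from collections import Counter


def most_frequent_entities(neighborhood, var_to_pair, top_k=5):
    """Return the obstacles and waypoints that occur most often in a neighborhood.

    neighborhood: iterable of binary-variable ids in one subset.
    var_to_pair: dict mapping variable id -> (obstacle_id, waypoint_id);
        an assumed encoding of the obstacle/waypoint relationship.
    """
    obstacle_counts = Counter()
    waypoint_counts = Counter()
    for var in neighborhood:
        obstacle, waypoint = var_to_pair[var]
        obstacle_counts[obstacle] += 1
        waypoint_counts[waypoint] += 1
    top_obstacles = [o for o, _ in obstacle_counts.most_common(top_k)]
    top_waypoints = [w for w, _ in waypoint_counts.most_common(top_k)]
    return top_obstacles, top_waypoints


if __name__ == "__main__":
    pairs = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (0, 2)}
    print(most_frequent_entities([0, 1, 2, 3], pairs, top_k=1))
```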

5.5 Limitations

While we have presented positive results on our proposed general LNS method, limitations exist. A major one is that neighborhoods need to be large for LNS to work well. For example, when we compare Random-LNS with Gurobi under the same wall-clock time limit on a combinatorial auction dataset with only 500 items and 1000 bids, Gurobi outperforms Random-LNS in objective value. This means that LNS is most effective in dealing with large-scale problem instances, not necessarily small ones. As a result, it is important to evaluate whether a problem is of the scale where the proposed LNS method is useful.

6 Conclusion & Future Work

We have presented a general large neighborhood search framework for solving integer programs. Our extensive benchmarks show that the proposed method consistently outperforms Gurobi in wall-clock time across diverse applications. Our method is also competitive with strong heuristics in specific problem domains.

We believe our current research has many exciting future directions that connect different aspects of the learning to optimize research. Our framework relies on a good solver, so progress from the learning to search community can lead to better LNS results. We briefly mentioned a version of the algorithm configuration problem we face in Section 5.1, but the full version, which adaptively chooses the number of subsets and the time limit, is surely more powerful. Algorithm configuration can also help optimize the hyperparameters of a solver for solving each sub-problem. Finally, our approach is closely tied to the effort to identify substructures in optimization problems. A deeper understanding of them can inform the design of our data-driven methods and define new variants of the learning problem within the proposed framework, e.g., learning to identify backdoor variables.

References

  • T. Achterberg, R. E. Bixby, Z. Gu, E. Rothberg, and D. Weninger (2019) Presolve reductions in mixed integer programming. INFORMS Journal on Computing. Cited by: §1.
  • T. Achterberg (2009) SCIP: solving constraint integer programs. Mathematical Programming Computation. Cited by: §2.1.
  • R. K. Ahuja, Ö. Ergun, J. B. Orlin, and A. P. Punnen (2002) A survey of very large-scale neighborhood search techniques. Discrete Applied Mathematics 123 (1-3), pp. 75–102. Cited by: §3.
  • R. Albert and A. Barabási (2002) Statistical mechanics of complex networks. Reviews of modern physics 74 (1), pp. 47. Cited by: §5.1.
  • C. Ansótegui, Y. Malitsky, H. Samulowitz, M. Sellmann, and K. Tierney (2015) Model-based genetic algorithms for algorithm configuration. In AISTATS, Cited by: §2.2.
  • M. Balcan, T. Dick, T. Sandholm, and E. Vitercik (2018) Learning to branch. In ICML, Cited by: §2.2.
  • R. Bar-Yehuda and S. Even (1983) A local-ratio theorem for approximating the weighted vertex cover problem. Technical report Computer Science Department, Technion. Cited by: §5.3.
  • B. Bérard, M. Bidoit, A. Finkel, F. Laroussinie, A. Petit, L. Petrucci, and P. Schnoebelen (2013) Systems and software verification: model-checking techniques and tools. Cited by: §1.
  • T. Berthold (2006) Primal heuristics for mixed integer programs. Cited by: §1.
  • C. Bliek, P. Bonami, and A. Lodi (2014) Solving mixed-integer quadratic programming problems with ibm-cplex: a progress report. In RAMP, Cited by: §2.1.
  • S. Burer, R. D. Monteiro, and Y. Zhang (2002) Rank-two relaxation heuristics for max-cut and other binary quadratic programs. SIAM Journal on Optimization 12 (2), pp. 503–521. Cited by: §5.3.
  • B. Chen, B. Dai, and L. Song (2020) Learning to plan via neural exploration-exploitation trees. In ICLR, Cited by: §2.1.
  • J. Cui, Y. S. Chu, O. O. Famodu, Y. Furuya, J. Hattrick-Simpers, R. D. James, A. Ludwig, S. Thienhaus, M. Wuttig, Z. Zhang, et al. (2006) Combinatorial search of thermoelastic shape-memory alloys with extremely small hysteresis width. Nature materials. Cited by: §1.
  • L. De Moura and N. Bjørner (2008) Z3: an efficient smt solver. In TACAS, Cited by: §1.
  • S. de Sousa, Y. Haxhimusa, and W. G. Kropatsch (2013) Estimation of distribution algorithm for the max-cut problem. In International Workshop on Graph-Based Representations in Pattern Recognition, pp. 244–253. Cited by: §5.3.
  • S. De Vries and R. V. Vohra (2003) Combinatorial auctions: a survey. INFORMS Journal on computing. Cited by: §1.
  • B. Dilkina, C. P. Gomes, Y. Malitsky, A. Sabharwal, and M. Sellmann (2009a) Backdoors to combinatorial optimization: feasibility and optimality. In CPAIOR, Cited by: §1, §2.3.
  • B. Dilkina, C. P. Gomes, and A. Sabharwal (2009b) Backdoors in the context of learning. In SAT, Cited by: §2.3.
  • M. Dorigo, M. Birattari, and T. Stutzle (2006) Ant colony optimization. IEEE computational intelligence magazine 1 (4), pp. 28–39. Cited by: §3.1.
  • P. Erdős and A. Rényi (1960) On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. Cited by: §5.1.
  • M. Fischetti and A. Lodi (2010) Heuristics in mixed integer programming. Wiley Encyclopedia of Operations Research and Management Science. Cited by: §1.
  • M. Gasse, D. Chételat, N. Ferroni, L. Charlin, and A. Lodi (2019) Exact combinatorial optimization with graph convolutional neural networks. In NeurIPS, Cited by: §1, §2.1.
  • H. He, H. Daume III, and J. M. Eisner (2014) Learning to search in branch and bound algorithms. In NeurIPS, Cited by: §1, §2.1.
  • S. Helber and F. Sahling (2010) A fix-and-optimize approach for the multi-level capacitated lot sizing problem. International Journal of Production Economics 123 (2), pp. 247–256. Cited by: §3.1.
  • K. Helsgaun (2009) General k-opt submoves for the lin–kernighan tsp heuristic. Mathematical Programming Computation 1 (2-3), pp. 119–163. Cited by: §3.1.
  • G. Hendel (2018) Adaptive large neighborhood search for mixed integer programming. Cited by: §3.1.
  • H. H. Hoos (2011) Automated algorithm configuration and parameter tuning. In Autonomous search, pp. 37–71. Cited by: §2.2.
  • A. Hottung and K. Tierney (2019) Neural large neighborhood search for the capacitated vehicle routing problem. arXiv:1911.09539. Cited by: §3.1.
  • F. Hutter, H. H. Hoos, and K. Leyton-Brown (2011) Sequential model-based optimization for general algorithm configuration. In LION, Cited by: §2.2.
  • E. B. Khalil, B. Dilkina, G. L. Nemhauser, S. Ahmed, and Y. Shao (2017a) Learning to run heuristics in tree search. In IJCAI, Cited by: §1, §2.1.
  • E. B. Khalil, P. Le Bodic, L. Song, G. Nemhauser, and B. Dilkina (2016) Learning to branch in mixed integer programming. In AAAI, Cited by: §1, §2.1, §2.1.
  • E. Khalil, H. Dai, Y. Zhang, B. Dilkina, and L. Song (2017b) Learning combinatorial optimization algorithms over graphs. In NeurIPS, Cited by: §2.1, §4.3.
  • Y. Kim, S. Wiseman, A. C. Miller, D. Sontag, and A. M. Rush (2018) Semi-amortized variational autoencoders. In ICML, Cited by: §2.3.
  • R. Kleinberg, K. Leyton-Brown, B. Lucier, and D. Graham (2019) Procrastinating with confidence: near-optimal, anytime, adaptive algorithm configuration. In NeurIPS, Cited by: §2.2.
  • A. H. Land and A. G. Doig (2010) An automatic method for solving discrete programming problems. In 50 Years of Integer Programming 1958-2008, pp. 105–132. Cited by: §1, §2.1.
  • K. Leyton-Brown, M. Pearson, and Y. Shoham (2000) Towards a universal test suite for combinatorial auction algorithms. In EC, Cited by: §5.1.
  • Z. Manna and R. J. Waldinger (1971) Toward automatic program synthesis. Communications of the ACM 14 (3), pp. 151–165. Cited by: §1.
  • A. Mirhoseini, H. Pham, Q. V. Le, B. Steiner, R. Larsen, Y. Zhou, N. Kumar, M. Norouzi, S. Bengio, and J. Dean (2017) Device placement optimization with reinforcement learning. In ICML, Cited by: §1.
  • H. D. Mittelmann (2017) Latest benchmarks of optimization software. Cited by: §1, §2.1.
  • M. Ono and B. C. Williams (2008) An efficient motion planning algorithm for stochastic dynamic systems with constraints on probability of failure. In AAAI, Cited by: §1, §5.1.
  • Gurobi Optimization (2019) Gurobi 8 performance benchmarks. Note: https://www.gurobi.com/pdfs/benchmarks.pdf Cited by: §1, §2.1.
  • D. Pisinger and S. Ropke (2010) Large neighborhood search. Handbook of metaheuristics. Cited by: §3.
  • D. A. Pomerleau (1989) Alvinn: an autonomous land vehicle in a neural network. In NeurIPS, Cited by: §4.2.
  • S. Ross and D. Bagnell (2010) Efficient reductions for imitation learning. In AISTATS, Cited by: §4.2.
  • E. Rothberg (2007) An evolutionary algorithm for polishing mixed integer programming solutions. INFORMS Journal on Computing 19 (4), pp. 534–541. Cited by: §3.1, §3.2.
  • P. Shaw (1998) Using constraint programming and local search methods to solve vehicle routing problems. In CP, Cited by: §3.
  • J. Song, R. Lanka, Y. Yue, and M. Ono (2019) Co-training for policy learning. In UAI, Cited by: §1, §2.1.
  • J. Song, R. Lanka, A. Zhao, Y. Yue, and M. Ono (2018) Learning to search via retrospective imitation. arXiv:1804.00846. Cited by: §1, §2.1, §2.1.
  • R. S. Sutton, D. A. McAllester, S. P. Singh, and Y. Mansour (2000) Policy gradient methods for reinforcement learning with function approximation. In NeurIPS, Cited by: §4.1.
  • A. A. Syed, K. Akhnoukh, B. Kaltenhaeuser, and K. Bogenberger (2019) Neural network based large neighborhood search algorithm for ride hailing services. In EPIA, Cited by: §3.1.
  • M. Wainwright, T. Jaakkola, and A. Willsky (2005) MAP estimation via agreement on trees: message-passing and linear programming. IEEE transactions on information theory. Cited by: §1.
  • L. Zhang and S. Malik (2002) The quest for efficient boolean satisfiability solvers. In CAV, Cited by: §1.

Appendix A Appendix

A.1 More Algorithm Configuration Results

In this section, we present additional algorithm configuration results, similar to those in Section 5.1.

1 2 3
2
3
4
5
Table 8: Parameter sweep results on the MVC dataset for Barabási-Albert random graphs with 1000 vertices.

1 2 3
2
3
4
5
Table 9: Parameter sweep results on the MAXCUT dataset for Erdős-Rényi random graphs with 500 vertices.

1 2 3
2
3
4
5
Table 10: Parameter sweep results on the MAXCUT dataset for Barabási-Albert random graphs with 500 vertices.

1 2 3
2
3
4
5
Table 11: Parameter sweep results on the CATS dataset for the regions distribution with 2000 items and 4000 bids.

1 2 3
2
3
4
5
Table 12: Parameter sweep results on the CATS dataset for the arbitrary distribution with 2000 items and 4000 bids.

1 2 3
2
3
4
5
Table 13: Parameter sweep results on risk-aware path planning with 30 obstacles.