1 Introduction
Industrial scheduling has been one of the most investigated combinatorial problems since the 1960s [12]. Since then, many formal definitions of the problem have been given (e.g. job-shop, open-shop, flow-shop) in order to isolate its core aspects and neglect the insignificant ones.
The job-shop scheduling problem [2] gained particular fame because its simple formulation leads to instances that are hard to solve optimally. The most typical optimization criterion is the minimization of the makespan, i.e. the time interval between the start of the first operation and the end of the last. The problem is presented as a set of jobs that must be processed by a set of machines. Each job is a sequence of operations; each operation has to be processed by a specific machine and takes a certain processing time. Every job has a specific order of operations that must be respected. An admissible solution is a sequence of operations on every machine such that no two operations on the same machine overlap in time and the order of operations within each job is respected.
Due to its combinatorial structure, it is natural to represent this problem as a constraint satisfaction problem. In fact, constraint-based approaches have been applied successfully to job-shop problems over the past years [3, 7, 13].
A more recent technique is Large Neighborhood Search (LNS) [8], which repeatedly relaxes and reoptimizes parts of the problem, allowing iterative improvements of the solution. This idea was also applied to MIP approaches (in the form of Relaxation Induced Neighborhood Search [6]). Hybrid approaches combining CP and MIP were also proposed [9] for non-regular objective functions (such as earliness costs).
Despite these advances in constraint solving, the last decade has seen declining research interest in CP applied to job-shop scheduling. Part of the problem is that the benchmarks widely used in the literature (see Section 2.2) are typically more than 20 years old and do not reflect current industrial demands. Nowadays, industry can easily require up to 2000 jobs to be scheduled on 100 machines [4]. In comparison, the biggest instance of the Taillard benchmark [15], which reflected the real dimensions of industrial problems in 1993 and is still among the largest available, has 50 jobs on 20 machines.
The de facto leader on the scheduling scene in recent years is IBM, with its proprietary solver CP Optimizer. This solver was capable of finding better solutions for many job-shop instances from the classic benchmarks [18], as well as of tackling industrial-size instances from the IBM scheduling benchmark, with instances of up to 1 million activities [10]. However, these instances are not publicly available.
In this paper we investigate the capabilities of the best available CP solvers on both classic benchmarks from the literature and industrial-size instances with proven optima [16]. By doing so, we aim to close the gap in job-shop research of the last years.
As anticipated, one of the most successful CP solvers for scheduling problems is CP Optimizer (abbreviated CPO). To find a worthy opponent, we took the winner of the MiniZinc Challenge 2018 (https://www.minizinc.org/challenge2018/challenge.html). The MiniZinc Challenge is a recurring competition in which all constraint solvers that support the MiniZinc modeling language compete on various combinatorial problems, including scheduling. OR-Tools (https://developers.google.com/optimization/), abbreviated ORT, an open-source solver developed by Google, won the gold medal in all categories in 2018. This paper is in the same line of research as [5]. However, in contrast to [5], we use a large-scale benchmark with proven optima. Furthermore, we extend the experimental setting such that, in addition to single-core experiments, we also report on experiments using four processing cores (quad core).
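The definition of an admissible solution and of the makespan can be made concrete with a small checker. This is a minimal sketch for illustration only: the data layout, the function names and the toy instance are assumptions introduced here and are not part of the original encodings.

```python
from typing import Dict, List, Tuple

# A job is an ordered list of (machine, processing_time) operations.
Job = List[Tuple[int, int]]
# A schedule maps (job index, operation index) to a start time.
Schedule = Dict[Tuple[int, int], int]


def is_admissible(jobs: List[Job], start: Schedule) -> bool:
    """Check the order of operations in each job and the absence of overlaps per machine."""
    per_machine: Dict[int, List[Tuple[int, int]]] = {}
    for j, job in enumerate(jobs):
        previous_end = 0
        for k, (machine, duration) in enumerate(job):
            s = start[(j, k)]
            if s < previous_end:  # starts before the preceding operation of the job ends
                return False
            previous_end = s + duration
            per_machine.setdefault(machine, []).append((s, previous_end))
    for intervals in per_machine.values():
        intervals.sort()
        for (_, end_a), (start_b, _) in zip(intervals, intervals[1:]):
            if start_b < end_a:  # two operations overlap on the same machine
                return False
    return True


def makespan(jobs: List[Job], start: Schedule) -> int:
    """Time between the start of the first operation and the end of the last."""
    first = min(start.values())
    last = max(start[(j, k)] + d
               for j, job in enumerate(jobs)
               for k, (_, d) in enumerate(job))
    return last - first


# Hypothetical toy instance: 2 jobs, 2 machines.
JOBS = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
GOOD = {(0, 0): 0, (0, 1): 3, (1, 0): 0, (1, 1): 3}
BAD = {(0, 0): 0, (0, 1): 3, (1, 0): 0, (1, 1): 2}  # overlap on machine 0
```

In this toy instance, `GOOD` respects both kinds of constraints, while `BAD` starts the second operation of job 1 on machine 0 while job 0 is still occupying it.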
2 Experimental Setup
The goal of the experiment is to compare the solving capabilities of IBM’s CP Optimizer and Google’s OR-Tools on job-shop problem instances with respect to solution quality (makespan) and solving time. The solvers compete on two benchmarks: one composed of classic instances from the literature, and a large-scale benchmark with known optimal solutions.
For the classic benchmark, the comparison follows the rules of the MiniZinc Challenge: solvers are given 20 minutes per problem instance. For the large-scale benchmark, we allow 6 hours to complete the search, since our aim is to simulate an industrial scenario in which the schedule for the day is typically computed overnight. We test the performance of the solvers with both a single-core and a quad-core configuration.
Concerning the solvers’ versions, we use version 12.8.0 of CP Optimizer and version 6.10.6025 of OR-Tools. In CPO we selected the default search parameters, which turned out to be the most effective in a preliminary test. In ORT we decided to use the CP-SAT solver, because the old CP solver is no longer updated by the Google researchers and because CP-SAT proved to be better on average in a pretest. The experiment is conducted on a system equipped with a 2 GHz AMD EPYC 7551P 32-core CPU and 256 GB of RAM.
2.1 Models
There are various ways to model the job-shop problem. MiniZinc, one of the best-known CP modeling languages, is supported by ORT but is not its native modeling language, whereas OPL is the native modeling language of CPO and requires no further translation. However, both solvers offer Java APIs. To avoid bias and to make the comparison as fair as possible, we modeled the problem in Java for both solvers, using the same constructs and constraints. Both models take advantage of interval variables, a problem-specific variable type that is well suited to representing job operations because it automatically ensures that, for each operation, start + duration = end. Each machine has a no-overlap constraint, which roughly corresponds to a cumulative constraint with the capacity set to 1. The following snippet shows the pseudocode for the model implementations (complete encodings and benchmarks are available at https://goo.gl/qarP3m):
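As a complement, the model described above can also be stated in the standard disjunctive form of job-shop scheduling. The following is a sketch and not the authors' pseudocode; the symbols $s_{j,k}$, $e_{j,k}$ and $p_{j,k}$ (start, end and processing time of the $k$-th operation of job $j$), $n_j$ (number of operations of job $j$) and $C_{\max}$ (makespan) are notation assumed here for illustration.

```latex
\begin{align*}
\min\ & C_{\max} \\
\text{s.t.}\ & e_{j,k} = s_{j,k} + p_{j,k}
  && \forall j,\ k \quad \text{(interval variable)} \\
& s_{j,k+1} \ge e_{j,k}
  && \forall j,\ k < n_j \quad \text{(order of operations within a job)} \\
& e_{j,k} \le s_{j',k'} \ \lor\ e_{j',k'} \le s_{j,k}
  && \forall (j,k) \ne (j',k') \text{ on the same machine} \quad \text{(no-overlap)} \\
& C_{\max} \ge e_{j,n_j}
  && \forall j \\
& s_{j,k} \ge 0
  && \forall j,\ k
\end{align*}
```

The first constraint is what an interval variable enforces implicitly, and the disjunction is what a no-overlap constraint enforces for all operations assigned to one machine.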
2.2 Problem Instances
Our tests are conducted on the classic benchmark and the large-scale benchmark. All the instances of the classic benchmark are rectangular job-shop instances. This means that every job has to go through all the machines; therefore, every job has a number of operations equal to the total number of machines, and every machine is assigned a number of operations equal to the total number of jobs. The classic benchmark (https://github.com/MiniZinc/minizinc-benchmarks) consists of 74 problem instances selected from the most used job-shop benchmarks in the literature:

FT: This benchmark is one of the oldest for job-shop scheduling and is defined in the book "Industrial Scheduling" [12]. It includes 3 problem instances of sizes 6x6, 10x10 and 20x5. The square instance 10x10 is famous for remaining unsolved for more than 20 years.

LA: This benchmark contains 40 problem instances from 10x5 to 30x10 [11].

ABZ: 5 problem instances from the work on the shifting bottleneck procedure [2].

ORB: 10 problem instances proposed by [3].

YN: 1 randomly generated problem instance of size 20x20 [19].

SWV: A set of 14 problem instances from [14].

VW: 1 instance from [17].
The large-scale benchmark comprises 24 instances divided into 8 groups of 3 by size, from 100 to 1000 machines and from 10 000 to 100 000 operations. All the instances have an optimal makespan of 600 000 seconds, which roughly corresponds to a week. Furthermore, there are 2 types of instances:

Long jobs: Fewer jobs with longer chains of operations;

Short jobs: More jobs with shorter chains of operations.
The full specification of the benchmark can be found in [16].
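Table 1: Results on the classic benchmark (msp = best makespan, secs = solving time in seconds).
Inst.  CPO single core msp (secs)  ORT single core msp (secs)  CPO quad core msp (secs)  ORT quad core msp (secs)  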
abz5  1234 (1.9)  1234 (1.8)  1234 (3.3)  1234 (1.6) 
abz6  943 (0.7)  943 (0.7)  943 (1.4)  943 (0.4) 
abz7  656 (1169.3)  660  656 (525)  661 (1200) 
abz8  682  679  680  679 
abz9  685  695  694  689 
ft06  55 (0)  55 (0)  55 (0)  55 (0) 
ft10  930 (3.8)  930 (5)  930 (5.9)  930 (2.9) 
ft20  1165 (1.4)  1165 (5)  1165 (0.5)  1165 (3.4) 
la01  666 (0)  666 (0.1)  666 (0)  666 (0.1) 
la02  655 (0.3)  655 (0.1)  655 (0.5)  655 (0.1) 
la03  597 (0.1)  597 (0.1)  597 (0.1)  597 (0) 
la04  590 (0.4)  590 (0.2)  590 (0.3)  590 (0.1) 
la05  593 (0)  593 (0)  593 (0)  593 (0.1) 
la06  926 (0)  926 (1.1)  926 (0)  926 (0.4) 
la07  890 (0)  890 (0.1)  890 (0.1)  890 (0.2) 
la08  863 (0)  863 (0.2)  863 (0)  863 (0.1) 
la09  951 (0)  951 (0.5)  951 (0)  951 (0.2) 
la10  958 (0)  958 (0.9)  958 (0)  958 (0.1) 
la11  1222 (0)  1222 (0.6)  1222 (0)  1222 (0.2) 
la12  1039 (0.1)  1039 (0.6)  1039 (0.2)  1039 (0.3) 
la13  1150 (0)  1150 (2.7)  1150 (0)  1150 (0.4) 
la14  1292 (0)  1292 (1.9)  1292 (0)  1292 (0.3) 
la15  1207 (0.1)  1207 (5.8)  1207 (0.2)  1207 (1.8) 
la16  945 (1.5)  945 (0.6)  945 (1.8)  945 (0.4) 
la17  784 (1.1)  784 (0.4)  784 (1.4)  784 (0.3) 
la18  848 (0.9)  848 (1)  848 (1.3)  848 (0.6) 
la19  842 (2.9)  842 (1.7)  842 (3.4)  842 (1.1) 
la20  902 (1.6)  902 (0.7)  902 (1.4)  902 (0.6) 
la21  1046 (22.9)  1046 (83.4)  1046 (51.4)  1046 (84.4) 
la22  927 (5.1)  927 (6.2)  927 (3.8)  927 (4.5) 
la23  1032 (0.1)  1032 (2.9)  1032 (0.4)  1032 (1.7) 
la24  935 (15.4)  935 (24.8)  935 (12.9)  935 (14.2) 
la25  977 (14.5)  977 (19.1)  977 (19.6)  977 (24.2) 
la26  1218 (7.4)  1218 (79.8)  1218 (0.8)  1218 (11.5) 
la27  1235 (127.6)  1235 (509.9)  1235 (1077.9)  1235 (479.9) 
la28  1216 (17.7)  1216 (14.5)  1216 (6.6)  1216 (7.3) 
la29  1152  1153  1152  1152 
Table 1 (continued)
Inst.  CPO single core msp (secs)  ORT single core msp (secs)  CPO quad core msp (secs)  ORT quad core msp (secs)  
la30  1355 (0.3)  1355 (21.2)  1355 (0.7)  1355 (8.3) 
la31  1784 (0.4)  1784 (24.1)  1784 (0.5)  1784 (11.6) 
la32  1850 (0)  1850 (29.4)  1850 (0.1)  1850 (20.5) 
la33  1719 (0.3)  1719 (14.6)  1719 (0.4)  1719 (35.3) 
la34  1721 (1.6)  1721 (69.5)  1721 (1)  1721 (31.5) 
la35  1888 (0.2)  1888 (25.7)  1888 (0.4)  1888 (14.7) 
la36  1268 (10.4)  1268 (11.3)  1268 (6.8)  1268 (6.6) 
la37  1397 (4)  1397 (8.7)  1397 (8.4)  1397 (4.6) 
la38  1196 (85.1)  1196 (265.1)  1196 (108.6)  1196 (134.7) 
la39  1233 (5.9)  1233 (14.2)  1233 (10.6)  1233 (5.3) 
la40  1222 (10)  1222 (53.2)  1222 (31)  1222 (30.5) 
orb01  1059 (7.1)  1059 (22.9)  1059 (9.6)  1059 (19.6) 
orb02  888 (2.2)  888 (2)  888 (3.3)  888 (1.2) 
orb03  1005 (6.6)  1005 (20.9)  1005 (17.8)  1005 (10.7) 
orb04  1005 (2.8)  1005 (3)  1005 (3.8)  1005 (1.9) 
orb05  887 (3.6)  887 (2.4)  887 (3.9)  887 (2.3) 
orb06  1010 (4.7)  1010 (8.7)  1010 (4.7)  1010 (6.6) 
orb07  397 (1.6)  397 (1.4)  397 (2)  397 (1.1) 
orb08  899 (1.1)  899 (1.4)  899 (1.5)  899 (1) 
orb09  934 (1.2)  934 (1.5)  934 (2)  934 (1.1) 
orb10  944 (0.7)  944 (1.9)  944 (1.2)  944 (1.1) 
swv01  1445  1412  1407 (909)  1415 
swv02  1491  1475 (906.2)  1475 (863)  1475 (192.5) 
swv03  1420  1410  1398 (938.8)  1415 
swv04  1520  1482  1517  1488 
swv05  1424 (1138.5)  1436  1427  1430 
swv06  1728  1746  1723  1722 
swv07  1672  1677  1690  1653 
swv08  1785  1855  1872  1832 
swv09  1713  1715  1733  1694 
swv10  1823  1807  1810  1814 
swv11  3041  3317  3095  3239 
swv12  3114  3358  3056  3312 
swv13  3205  3421  3161  3321 
swv14  3032  3162  2985  3095 
vw3x3  256 (0)  256 (0)  256 (0)  256 (0) 
yn4  980  994  993  994 
3 Results
Table 1 shows the results of CP Optimizer compared to OR-Tools on the classic benchmark, in single-core and quad-core configurations. Since the dataset is large, the table is split into two parts. Each cell reports the best makespan achieved before the timeout. If the optimal makespan is reached, the search stops, so we show the actual solving time in parentheses. To detect whether a solution is optimal, the solver computes a lower bound, i.e. an estimate of the objective below which no solution can exist (typically obtained by solving a relaxation of the scheduling problem). When the makespan of the solution found equals the lower bound, the solution is optimal.
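The optimality test can be illustrated with two very simple lower bounds: no schedule can be shorter than the longest job, nor shorter than the total work assigned to the busiest machine. This is only a sketch under that assumption; CP Optimizer and OR-Tools compute much stronger bounds internally, and the function names and toy instance below are hypothetical.

```python
from typing import Dict, List, Tuple

Job = List[Tuple[int, int]]  # ordered (machine, processing_time) pairs


def simple_lower_bound(jobs: List[Job]) -> int:
    """Maximum of the longest job and the busiest machine (weak but valid)."""
    longest_job = max(sum(d for _, d in job) for job in jobs)
    load: Dict[int, int] = {}
    for job in jobs:
        for machine, d in job:
            load[machine] = load.get(machine, 0) + d
    return max(longest_job, max(load.values()))


def is_proven_optimal(best_makespan: int, lower_bound: int) -> bool:
    """A solution is proven optimal when it meets the lower bound."""
    return best_makespan == lower_bound


# Hypothetical toy instance: 2 jobs, 2 machines.
TOY = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
TOY_BOUND = simple_lower_bound(TOY)  # machine 0 carries 3 + 4 = 7 time units
```

For the toy instance, a schedule with makespan 7 would meet the bound and is therefore optimal, whereas a schedule with makespan 8 would leave the question open.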
Concerning the single core, CP Optimizer found a better solution than OR-Tools in 13 out of 74 problem instances (about 18 % of the instances). OR-Tools found better solutions in 6 cases, about 8 % of the total. CP Optimizer was faster of the time, OR-Tools , and in all the other cases both solvers reached the timeout of 1200 seconds. Under the scoring system of the MiniZinc Challenge (we used the complete scoring procedure described in https://www.minizinc.org/challenge2018/rules2018.html), CP Optimizer would score points, while OR-Tools would score points. CP Optimizer solved 59 problem instances optimally (about 80 %), compared to 58 problem instances (about 78 %) for OR-Tools. In particular, OR-Tools was the only solver to find the optimum for instance swv02, while CP Optimizer was the only one to find the optimal solution for swv05 and abz7.
By exploiting multiple cores, both solvers were able to slightly improve their solutions. For example, the additional cores allowed CPO to find the optimum on the instances swv01, swv02 and swv03, and to find the optimum on abz7 in half of the solving time. ORT also benefited from the additional cores, finding the optimum on swv02 in a quarter of the time.
Table 2 shows the results on the large-scale benchmark with known optima. In the single-core experiment, CPO is able to solve 16 out of 24 instances optimally (two thirds). In general, it solved almost all the short job instances to optimality (except for 2 instances, which were still very close to the optimal solution). Concerning the long job instances, all the instances with 10 000 operations reached the optimal solution, while the optimum was never reached in the cases with 100 000 operations. The hardest instances to solve were the ones with long jobs, 100 machines and 100 000 operations. The worst result achieved was less than 80 % off the optimum.
ORT in single core was not able to solve any of the instances to optimality. On two occasions, namely longJobs-1000-10000-1 and shortJobs-1000-10000-3, it did not find any admissible solution within the timeout. Apart from that, the hardest instances were the long jobs with 100 000 operations, where the worst result of 206 % off the optimum (more than 3 times the optimal makespan) was recorded. In general, better performance was achieved on the short job instances than on the long ones. In fact, ORT scored on average 40 % off the optimum on the short job instances and 154 % off on the long ones (excluding the timeout cases). The best result is registered on a short job instance with 100 machines and 100 000 operations, which is 10 % off the optimum.
In the quad-core test, however, things change dramatically. ORT is able to find the optimum in 7 of the long instances, even beating CPO in two of the instances with 1000 machines and 100 000 operations. Some improvements were also registered on the other instances. CPO, apparently, does not benefit from the quad core as much as ORT: the solutions found were just marginally better or even worse than their single-core counterparts.
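The "percent off the optimum" figures can be read as the relative deviation of the makespan found from the known optimum of 600 000. The sketch below assumes that definition (it is consistent with 206 % off corresponding to more than 3 times the optimal makespan); the function name is hypothetical.

```python
def percent_off_optimum(makespan: int, optimum: int = 600_000) -> float:
    """Relative deviation from the known optimum, in percent."""
    return 100.0 * (makespan - optimum) / optimum


# Largest single-core OR-Tools makespan reported in Table 2.
WORST_ORT = 1_838_357
worst_gap = percent_off_optimum(WORST_ORT)  # roughly 206 %
worst_ratio = WORST_ORT / 600_000           # slightly above 3
```

A solution that reaches the optimum has a deviation of 0 %, and a solution twice as long as the optimum is 100 % off.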
Table 2: Results on the large-scale benchmark (msp = best makespan, secs = solving time in seconds). Instance names follow the pattern type-numMachines-numOps-id.
Instance  CPO single core msp (secs)  ORT single core msp (secs)  CPO quad core msp (secs)  ORT quad core msp (secs)  
longJobs-100-10000-1  600000 (8096)  1390577  600000 (6913)  600000 (188)  
longJobs-100-10000-2  600000 (10399)  1463638  600000 (7631)  600000 (580)  
longJobs-100-10000-3  600000 (10294)  1435995  600000 (8339)  600000 (226)  
longJobs-100-100000-1  1077736  1646792  1077862  1642753  
longJobs-100-100000-2  1066971  1628456  1066438  1618288  
longJobs-100-100000-3  1070306  1644806  1070616  1636805  
longJobs-1000-10000-1  600000 (2)  No Solution  600000 (3)  No Solution  
longJobs-1000-10000-2  600000 (1)  1162719 (4260)  600000 (3)  600000 (7)  
longJobs-1000-10000-3  600000 (2)  1081297  600000 (2)  600000 (2)  
longJobs-1000-100000-1  807297  1722413  749737  600000 (563)  
longJobs-1000-100000-2  818596  1838357  817481  600000 (3002)  
longJobs-1000-100000-3  837938  1736608  839195  1738491  
shortJobs-100-10000-1  600000 (12)  788640  600000 (17)  762347  
shortJobs-100-10000-2  600000 (12)  739425  600000 (24)  741028  
shortJobs-100-10000-3  600000 (19)  752895  600000 (29)  735739  
shortJobs-100-100000-1  600000 (4384)  652436  600000 (5281)  650084 (5389)  
shortJobs-100-100000-2  600000 (4377)  650084  600000 (5227)  No Solution  
shortJobs-100-100000-3  600000 (4287)  661374  600000 (5435)  6611374 (6467)  
shortJobs-1000-10000-1  600000 (20026)  1405776  600000 (19865)  1068441  
shortJobs-1000-10000-2  600000 (16538)  1103354  600000 (17375)  1027733  
shortJobs-1000-10000-3  603447  No Solution  600699  No Solution  
shortJobs-1000-100000-1  600000 (20956)  822552  600147  790129  
shortJobs-1000-100000-2  600000 (16094)  795075  600057  791255  
shortJobs-1000-100000-3  600106  808808  600142  805050  
4 Conclusion
CPO performed better in general on the classic benchmark and especially on the large-scale benchmark. ORT benefited more than CPO from the quad-core configuration, which allowed ORT to find optimal solutions even in the large-scale benchmark.
To explain the difference in performance, we analyzed the differences between the two solvers. While both use interval variables to express the job operations, CPO uses basic types to encode the intervals, whereas ORT uses three variables for a single interval, which slows down constraint propagation. The two solvers use a similar search strategy based on large neighbourhood search (LNS), which iteratively relaxes and reoptimizes the scheduling problem. However, while CPO uses portfolio strategies in combination with machine learning to converge to the best neighbourhoods, ORT uses a much simpler approach based on random variable/constraint selection.
Furthermore, CPO uses a "plan B" strategy called failure-directed search (FDS), which is triggered when LNS is not able to improve the current solution. However, we tested the impact of FDS by rerunning the experiment with FDS switched off, and we found the impact to be limited on the classic benchmark (some instances improved, some worsened, many stayed the same) and nonexistent on the big instances (some actually improved slightly without FDS).
In conclusion, CP Optimizer performed slightly better than OR-Tools on the classic benchmark, but was clearly superior on the large-scale one. In fact, CP Optimizer was able to optimally solve 66% of the large-scale instances, against 29% for OR-Tools (quad core). By exploiting multiple cores, OR-Tools was also able to find optimal solutions for the large-scale instances, showing that present-day CP solvers in general can be successfully applied to real-world industrial problems.
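The relax-and-reoptimize loop of LNS can be sketched on a toy job-shop instance. This is an illustration only and does not reproduce the neighbourhoods or the search used by CP Optimizer or OR-Tools: the solution encoding (a sequence of job indices decoded greedily), the random choice of jobs to relax, the sampling-based reoptimization and all names are assumptions made for this example.

```python
import random
from typing import Dict, List, Tuple

Job = List[Tuple[int, int]]  # ordered (machine, processing_time) pairs


def decode(jobs: List[Job], sequence: List[int]) -> int:
    """Schedule operations in sequence order at their earliest start; return the makespan."""
    next_op = [0] * len(jobs)
    job_ready = [0] * len(jobs)
    machine_ready: Dict[int, int] = {}
    for j in sequence:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


def initial_sequence(jobs: List[Job]) -> List[int]:
    """Each job index appears once per operation: job 0 first, then job 1, and so on."""
    return [j for j, job in enumerate(jobs) for _ in job]


def lns(jobs: List[Job], iterations: int = 200, relax: int = 1,
        samples: int = 20, seed: int = 0) -> Tuple[List[int], int]:
    rng = random.Random(seed)
    best = initial_sequence(jobs)
    best_cost = decode(jobs, best)
    for _iteration in range(iterations):
        # Relaxation: free all operations of a few randomly chosen jobs.
        freed = set(rng.sample(range(len(jobs)), relax))
        fixed = [j for j in best if j not in freed]
        pending = [j for j in best if j in freed]
        # Reoptimization: sample re-insertions of the freed operations.
        for _sample in range(samples):
            candidate = fixed[:]
            for j in pending:
                candidate.insert(rng.randint(0, len(candidate)), j)
            cost = decode(jobs, candidate)
            if cost < best_cost:  # keep strict improvements only
                best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical toy instance: 3 jobs, 3 machines.
TOY_JOBS = [[(0, 3), (1, 2), (2, 2)], [(0, 2), (2, 1), (1, 4)], [(1, 4), (2, 3)]]
```

Calling `lns(TOY_JOBS)` returns a job sequence and its makespan; the result is never worse than the initial sequence because only strict improvements are accepted.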
References
[2] Joseph Adams, Egon Balas & Daniel Zawack (1988): The Shifting Bottleneck Procedure for Job Shop Scheduling. Management Science 34(3), pp. 391–401, doi:http://dx.doi.org/10.1287/mnsc.34.3.391. Available at http://www.jstor.org/stable/2632051.
[3] David Applegate & William Cook (1991): A Computational Study of the Job-Shop Scheduling Problem. ORSA Journal on Computing 3(2), pp. 149–156, doi:http://dx.doi.org/10.1287/ijoc.3.2.149.
[4] Giacomo Da Col & Erich C Teppan (2016): Declarative decomposition and dispatching for large-scale job-shop scheduling. In: Joint German/Austrian Conference on Artificial Intelligence (Künstliche Intelligenz), Springer, pp. 134–140, doi:http://dx.doi.org/10.1007/978-3-319-46073-4_11.
[5] Giacomo Da Col & Erich C Teppan (2019): Industrial Size Job-Shop Scheduling tackled by Present-Day CP Solvers. In: CP 2019: Principles and Practice of Constraint Programming.
[6] Emilie Danna, Edward Rothberg & Claude Le Pape (2003): Integrating mixed integer programming and local search: A case study on job-shop scheduling problems. In: Fifth International Workshop on Integration of AI and OR techniques in Constraint Programming for Combinatorial Optimisation Problems (CPAIOR’2003), pp. 65–79.
[7] Mark S Fox, Bradley P Allen & Gary Strohm (1982): Job-Shop Scheduling: An Investigation in Constraint-Directed Reasoning. In: AAAI, pp. 155–158.
[8] Philippe Laborie & Daniel Godard (2007): Self-adapting large neighborhood search: Application to single-mode scheduling problems. Proceedings MISTA-07, Paris 8.
[9] Philippe Laborie & Jérôme Rogerie (2016): Temporal linear relaxation in IBM ILOG CP Optimizer. Journal of Scheduling 19(4), pp. 391–400, doi:http://dx.doi.org/10.1007/s10951-014-0408-7.
[10] Philippe Laborie, Jérôme Rogerie, Paul Shaw & Petr Vilím (2018): IBM ILOG CP optimizer for scheduling. Constraints 23(2), pp. 210–250, doi:http://dx.doi.org/10.1007/s10601-018-9281-x.
[11] S. Lawrence (1984): Resource constrained project scheduling: an experimental investigation of heuristic scheduling techniques (Supplement). Graduate School of Industrial Administration, Carnegie-Mellon University.
[12] J.F. Muth & G.L. Thompson (1963): Industrial Scheduling. International series in management, Prentice-Hall. Available at https://books.google.at/books?id=A5AgAAAAMAAJ.
[13] Norman M. Sadeh & Mark S. Fox (1996): Variable and value ordering heuristics for the job shop scheduling constraint satisfaction problem. Artificial Intelligence 86, pp. 1–41, doi:http://dx.doi.org/10.1016/0004-3702(95)00098-4.
[14] Robert H Storer, S David Wu & Renzo Vaccari (1992): New search spaces for sequencing problems with application to job shop scheduling. Management Science 38(10), pp. 1495–1509, doi:http://dx.doi.org/10.1287/mnsc.38.10.1495.
[15] Eric Taillard (1993): Benchmarks for basic scheduling problems. European Journal of Operational Research 64(2), pp. 278–285, doi:http://dx.doi.org/10.1016/0377-2217(93)90182-M.
[16] Erich C Teppan & Giacomo Da Col (2018): Automatic Generation of Dispatching Rules for Large Job Shops by Means of Genetic Algorithms. In: 8th International Workshop on Combinations of Intelligent Methods and Applications (CIMA 2018), pp. 43–57.
[17] Manuel Vazquez & L Darrell Whitley (2000): A comparison of genetic algorithms for the dynamic job shop scheduling problem. In: Proceedings of the 2nd Annual Conference on Genetic and Evolutionary Computation, Morgan Kaufmann Publishers Inc., pp. 1011–1018.
[18] Petr Vilím, Philippe Laborie & Paul Shaw (2015): Failure-directed search for constraint-based scheduling. In: International Conference on AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, Springer, pp. 437–453, doi:http://dx.doi.org/10.1007/978-3-319-18008-3_30.
[19] Takeshi Yamada & Ryohei Nakano (1992): A Genetic Algorithm Applicable to Large-Scale Job-Shop Problems. In: PPSN, pp. 283–292.