1.1 Basic notation and problem definition
Given a simple graph $G = (V, E)$ and a set of colors $C = \{1, \ldots, k\}$, define a coloring as a function $c : V \to C$ which assigns to each vertex $v \in V$ a color $c(v) \in C$. A coloring is said to be proper if $c(u) \neq c(v)$ for every $uv \in E$. An example of a proper coloring is illustrated in Figure 1.
Given a coloring $c$, define $v \in V$ to be a b-vertex if $v$ has at least one neighbor with each color in $C \setminus \{c(v)\}$; more precisely, for every $j \in C \setminus \{c(v)\}$ there is a vertex $u$ with $uv \in E$ and $c(u) = j$. A coloring is said to be a b-coloring if every color in $C$ has at least one associated b-vertex. Examples of b-colorings are illustrated in Figure 2. Alternatively, define the color classes of $c$ as the parts of a partition of $V$ into independent sets $V_i = \{v \in V : c(v) = i\}$ for each $i \in C$. A vertex $v$ with $c(v) = i$ is called a b-vertex for color $i$ if $v$ has a neighbor representing every other color class, i.e., for all $j \in C \setminus \{i\}$ there is a vertex $u \in V_j$ with $uv \in E$. In this alternative definition, a b-coloring is a proper coloring such that every color class has a b-vertex.
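The two definitions above can be checked mechanically. The following is a minimal sketch, assuming the graph is stored as a dictionary of adjacency sets and the coloring as a dictionary from vertices to integer colors; the function names are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Hashable, Set

# Assumed representation: vertex -> set of adjacent vertices.
Graph = Dict[Hashable, Set[Hashable]]


def is_proper(adj: Graph, color: Dict[Hashable, int]) -> bool:
    """True if no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


def b_vertices(adj: Graph, color: Dict[Hashable, int]) -> Dict[int, Set[Hashable]]:
    """Map each used color to the set of its b-vertices."""
    used = set(color.values())
    result: Dict[int, Set[Hashable]] = {k: set() for k in used}
    for v in adj:
        seen = {color[u] for u in adj[v]}
        # v is a b-vertex if its neighbors cover every color other than its own
        if used - {color[v]} <= seen:
            result[color[v]].add(v)
    return result


def is_b_coloring(adj: Graph, color: Dict[Hashable, int]) -> bool:
    """Proper coloring in which every used color has at least one b-vertex."""
    return is_proper(adj, color) and all(b_vertices(adj, color).values())


# Path a-b-c-d: the 2-coloring is a b-coloring, the proper 3-coloring is not,
# because neither vertex of color 1 sees both colors 2 and 3.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
two_colors = {"a": 1, "b": 2, "c": 1, "d": 2}
three_colors = {"a": 1, "b": 2, "c": 3, "d": 1}
```

In the second coloring, color 1 can be removed by recoloring its two vertices, which is exactly the improvement step discussed later for proper colorings that are not b-colorings.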
The chromatic number of a graph $G$, $\chi(G)$, is the minimum number of colors needed to properly color $G$. The b-chromatic number of a graph $G$, $\chi_b(G)$, is the maximum number of colors for which $G$ admits a b-coloring. The coloring problem consists in finding a proper coloring of a graph minimizing the number of colors. The b-coloring problem consists in finding a b-coloring of a graph maximizing the number of colors. The problem of finding $\chi_b(G)$ was shown to be NP-hard in IrvMan99, thus the b-coloring problem is NP-hard.
Although the b-coloring and the coloring problems appear to be closely related, they have several differences. First of all, their objective functions point in opposite directions, and the difference between their optimal solution values can be arbitrarily large (Kratochvíl et al., 2002). Furthermore, the b-coloring problem can be largely influenced by the girth (length of a shortest cycle) of the graph, which is not exactly the case for the coloring problem (Campos et al., 2013). Besides, a property that is commonly exploited by constructive and enumerative methods for the coloring problem is the fact that one can have solutions with the number of colors ranging from the chromatic number to the cardinality of the vertex set. However, it is not true that one can construct a b-coloring with $k$ colors for every integer $k$ ranging from the chromatic number to the b-chromatic number (Barth et al., 2007). Additionally, notice that a proper graph coloring which is not a b-coloring can be trivially improved by the removal of a color, namely, one that does not have a b-vertex. Therefore, when one is trying to minimize the number of colors, b-colorings appear naturally, as otherwise the available coloring could be easily improved. On the other hand, when one is trying to maximize the number of colors, it is a challenging task to increase the number of colors while ensuring that b-vertices are generated for every new color. This suggests that the search for good quality solutions for the b-coloring problem should explore the structure of feasible solutions in a different manner.
CaLiMa15 presented a motivation for solving the b-coloring problem, namely, finding an upper bound for the b-algorithm, which is a heuristic approach for the coloring problem. The b-algorithm works as follows: it begins with a greedy coloring and afterwards tries to reduce the number of used colors by changing the colors of certain vertices. In this context, a b-vertex represents a vertex that cannot have its color changed and thus forbids further improvements by the b-algorithm. Hence, the b-chromatic number represents the worst case of the b-algorithm.
Let be the open neighborhood (or simply neighborhood) of in , be the closed neighborhood of in , be the anti-neighborhood of in , and be the closed anti-neighborhood of in . Define to be the set of colors adjacent to , which we denote the color neighborhood of . Also, let be the color closed neighborhood, and be the color anti-neighborhood of . Denote the degree of by , which is the size of its neighborhood . Considering to be the maximum degree of a graph, we write whenever is clear from the context. The neighborhood of a -vertex can contain at most colors. Define the color degree of by , which is the size of its color neighborhood . Consider a sorting of the vertices such that . The invariant provides an upper bound for the -chromatic number of . Let be the subset of vertices with degree at least , i.e. for each , . Denote as the set of colors that were attributed to some vertex in in a given coloring , i.e. if there is a vertex such that .
1.2 Literature review
The concept of b-coloring appeared in different applications. GacEglLebEmp08,GacEglLebEmp09 applied b-coloring to improve postal mail sorting systems, which are based on efficient optical recognition of the addresses on envelopes. The authors presented a new approach for address block localization, which is a very important step in the recognition of the addresses. Their approach uses b-coloring to train a classifier in the identification of the address block, and according to the authors a rate of 98% correct localizations on a set of 750 envelope images was obtained. ElgDesHacDusKhe06 proposed a new clustering approach based on b-coloring of graphs. The presented cluster validation algorithm evaluates the quality of clusters based on the b-vertex property. The authors use this clustering technique to detect a new typology of hospital stays in the French healthcare system.
Several authors studied properties of b-coloring for special classes of graphs. KraTuzVoi02 have shown that deciding the b-chromatic number is NP-complete even for bipartite graphs. A graph $G$ is $m$-tight if it has exactly $m = m(G)$ vertices with degree exactly equal to $m - 1$. In this regard, HavSalSam12 proved that deciding if $\chi_b(G) = m(G)$ is NP-complete for tight chordal graphs, while showing that the b-chromatic number of a split graph can be obtained in polynomial time.
Primal bound results were introduced by IrvMan99. The chromatic number is a lower bound, as every b-coloring is also a proper coloring. The upper bound is $\Delta(G) + 1$, where $\Delta(G)$ is the maximum degree, on account of the additional color being the color of a b-vertex itself. This upper bound can be narrowed, since a b-coloring requires a sufficient number of vertices of high degree. Naturally, for a b-coloring with $k$ colors, at least $k$ vertices with minimum degree $k - 1$ are necessary. As a consequence, the invariant $m(G)$ is a reduced upper bound for the problem. A variety of bounds on the b-chromatic number were also presented in AlkKoh11, BalRaj13, KouMah02.
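The degree-based bound just described is easy to compute. The sketch below assumes the invariant is the usual one from Irving and Manlove, namely the largest $i$ such that the $i$-th vertex in nonincreasing degree order has degree at least $i - 1$; the function name `m_bound` is hypothetical.

```python
from typing import Iterable


def m_bound(degrees: Iterable[int]) -> int:
    """Largest i such that at least i vertices have degree >= i - 1."""
    ordered = sorted(degrees, reverse=True)
    m = 0
    for i, d in enumerate(ordered, start=1):
        if d >= i - 1:
            m = i
        else:
            # degrees only decrease while i - 1 grows, so no later i can qualify
            break
    return m


# Star with four leaves: maximum degree 4, so the trivial bound is 5,
# while the degree-based bound is only 2.
star_degrees = [4, 1, 1, 1, 1]
# Complete graph on four vertices: both bounds coincide at 4.
k4_degrees = [3, 3, 3, 3]
```

The star illustrates how far the two bounds can be apart: a single high-degree vertex cannot supply b-vertices for five distinct colors.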
Regular graphs belong to a special class of graphs such that $m(G) = \Delta(G) + 1$, one of the main reasons why they attract significant study. KraTuzVoi02 have shown that $\chi_b(G) = d + 1$ for every $d$-regular graph $G$ with at least $d^4$ vertices, establishing that there is only a limited number of $d$-regular graphs for which $\chi_b(G) < d + 1$. Later, CabJak11 proved that $\chi_b(G) = d + 1$ for every $d$-regular graph $G$ with at least $2d^3$ vertices. A detailed review of the literature related to the b-chromatic number can be found in JakPet18.
The b-coloring problem for more general graphs was considered in several works. CorValVer05 introduced an approximation approach for the b-chromatic number. They have shown that the b-chromatic number cannot be approximated within a factor of $120/113 - \epsilon$ for any constant $\epsilon > 0$, unless P = NP. GalKat13 settled negatively the question about the existence of a constant-factor approximation algorithm for the b-chromatic number, proving that for graphs with $n$ vertices, there is no $\epsilon > 0$ for which the problem can be approximated within a factor $n^{1/4 - \epsilon}$, unless P = NP.
Despite the fact that the b-coloring problem has received a lot of attention from the graph theory community, just a few authors considered optimization approaches such as metaheuristics or integer programming. To the best of our knowledge, FisPetMerCre15 were the first authors to propose a metaheuristic algorithm for the b-coloring problem. They proposed a hybrid evolutionary algorithm and tested its performance on a set of small instances composed of $d$-regular graphs. For the tested $d$-regular instances, the metaheuristic obtained the optimal solutions, which were attested using a brute force method. Encouraged by those results, the authors also considered larger benchmark instances from the second DIMACS implementation challenge (Johnson and Trick, 1996). As far as our knowledge goes, the only metaheuristic for the b-coloring problem is the one presented in FisPetMerCre15, contrasting with the classical graph coloring problem, as the latter has a diversity of heuristic methods proposed in the literature (Avanthay et al., 2003; Blöchliger and Zufferey, 2008; de Werra, 1990; Lü and Hao, 2010; Mabrouk et al., 2009).
KocPet15 introduced an integer linear programming formulation for the b-chromatic index, the edge version of the problem. The authors also provide bounds and general results for a diversity of direct products of graphs regarding the b-chromatic index. KocMar18 proposed an integer programming approach for the decision version of the b-coloring problem, which consists in determining whether a graph admits a b-coloring with a given number of colors. The authors also performed a polyhedral study of the proposed formulation, presented valid inequalities and implemented a branch-and-cut algorithm. Computational experiments were performed testing whether for the input graphs.
1.3 Main contributions and organization
The main contributions of this paper are an integer programming formulation for the b-coloring problem, a very effective multi-start multi-greedy randomized metaheuristic which attempts to explore the problem structure in the search for good quality solutions, and a matheuristic approach obtained by combining the proposed multi-start metaheuristic with a fix-and-optimize local search based on the introduced integer programming formulation. To the best of our knowledge, this paper presents the first matheuristic for the b-coloring problem, and the first integer programming formulation which can be directly applied to its optimization version. Furthermore, we present a benchmark set consisting of newly created instances as well as available ones for coloring and maximum clique problems. The computational experiments show that the newly proposed approaches are very effective, reaching and proving optimality for several of the tested instances. Furthermore, the approaches are able to outperform a state-of-the-art metaheuristic (Fister et al., 2015) for the b-coloring problem when taking into consideration all nine large instances considered by those authors.
The remainder of the paper is organized as follows. Section 2 introduces an integer programming formulation for the b-coloring problem. Section 3 describes the multi-greedy randomized heuristic. Section 4 presents the multi-start multi-greedy randomized metaheuristic, the MIP (mixed-integer programming) based fix-and-optimize local search procedure using the proposed integer programming formulation, and the matheuristic approach which is obtained by combining the first two. Section 5 summarizes the computational experiments. Final considerations are discussed in Section 6.
2 Integer programming formulation
We now describe a formulation by representatives (Campêlo et al., 2004) for the b-coloring problem. Consider the binary variable $x_{uv}$ to be equal to one if vertex $u$ represents the color of vertex $v$ and to be zero otherwise, defined for every ordered pair $(u, v)$, with $u \in V$ and $v$ in the closed anti-neighborhood of $u$. In the proposed formulation, a vertex $u$ can only represent the color of another vertex if $x_{uu} = 1$, which means that $u$ is the representative and also a b-vertex of that color. Note that a color may have several b-vertices, but only one of them will be the representative.
Define the set of vertices in the anti-neighborhood of which are not adjacent to other vertices in this anti-neighborhood as . Additionally, consider the complement of as . The -coloring problem can be cast as the following linear integer program:
The objective function (1) maximizes the number of representative vertices, which are the b-vertices. Constraints (2) ensure that every vertex must have a color. Constraints (3) force the coloring to be proper. Constraints (4) guarantee that a vertex can only give a color if it is a representative (notice that if this constraint is removed, a vertex that has a stable set as anti-neighborhood is allowed to represent all its anti-neighborhood without being a representative). Constraints (5) are the b-coloring restrictions which imply that if both $u$ and $v$ are b-vertices, then there must be a neighbor of $v$ which is represented by $u$. This is achieved due to the fact that if both $u$ and $v$ are representatives, the right-hand side is equal to one, implying that the summation in the left-hand side, which is taken over the neighbors of $v$ that can be represented by $u$, should be at least one. Constraints (6) ensure the integrality requirements on the variables.
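A model matching the description of (1)-(6) can be sketched as follows. The notation is an assumption made for this sketch only: $x_{uv}$ is the binary variable stating that $u$ represents the color of $v$, $N(v)$ is the neighborhood of $v$, $\overline{N}(u)$ and $\overline{N}[u]$ are the open and closed anti-neighborhoods of $u$, and $\overline{N}'(u) \subseteq \overline{N}(u)$ collects the vertices of $\overline{N}(u)$ that have no neighbor inside $\overline{N}(u)$. The index sets are inferred from the textual description and may differ in detail from the authors' formulation.

```latex
\begin{align}
\max\ & \sum_{u \in V} x_{uu} \tag{1}\\
\text{s.t.}\ & \sum_{u \in \overline{N}[v]} x_{uv} = 1,
  && \forall v \in V, \tag{2}\\
& x_{uv} + x_{uw} \le x_{uu},
  && \forall u \in V,\ \forall v, w \in \overline{N}(u) \text{ with } vw \in E, \tag{3}\\
& x_{uv} \le x_{uu},
  && \forall u \in V,\ \forall v \in \overline{N}'(u), \tag{4}\\
& \sum_{w \in N(v) \cap \overline{N}[u]} x_{uw} \ge x_{uu} + x_{vv} - 1,
  && \forall u, v \in V,\ u \neq v, \tag{5}\\
& x_{uv} \in \{0, 1\},
  && \forall u \in V,\ \forall v \in \overline{N}[u]. \tag{6}
\end{align}
```

Under this reading, (3) already implies $x_{uv} \le x_{uu}$ for every $v$ that has a neighbor in $\overline{N}(u)$, which is why (4) is only needed for the vertices in $\overline{N}'(u)$. When $u$ and $v$ are adjacent, the term $x_{uu}$ itself appears on the left-hand side of (5), so the constraint is trivially satisfied.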
Let be any valid lower bound for the optimal value of formulation (1)-(6), i.e. . Let be the set of vertices with degree strictly smaller than , i.e., for every . Therefore, one can set to zero variables corresponding to vertices without losing optimality, as the vertices in can never be -vertices in a -coloring with at least colors. Furthermore, variables would also be set to zero for every pair such that and .
Let be an integer feasible solution for (1)-(6) with objective value . In any solution which strictly improves , every vertex which is determined to be a representative must have degree at least , i.e. . Let be the set of vertices with degree strictly smaller than , i.e., for every . Therefore, in order to obtain a solution which strictly improves , one can set to zero variables corresponding to vertices without losing optimality in case such improving solution exists. Similarly to Observation 1, variables would also be set to zero for every pair such that and .
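The degree argument behind both observations is that a b-vertex in a coloring with $k$ colors must see $k - 1$ distinct colors and therefore needs degree at least $k - 1$. A minimal sketch of the resulting fixing set, assuming this threshold and using hypothetical names, is given below.

```python
from typing import Dict, Hashable, Set


def cannot_be_representative(degree: Dict[Hashable, int], min_colors: int) -> Set[Hashable]:
    """Vertices that cannot be b-vertices in any b-coloring with >= min_colors colors.

    Assumption of this sketch: the fixing threshold is min_colors - 1, which
    follows from the degree argument but is not copied from the paper.
    """
    return {v for v, d in degree.items() if d < min_colors - 1}


degree = {"a": 1, "b": 2, "c": 2, "d": 1}

# Fixing from a valid lower bound of 3 colors.
fixed_from_bound = cannot_be_representative(degree, min_colors=3)

# Fixing when looking for a solution strictly better than an incumbent
# with 2 colors: such a solution needs at least 3 colors as well.
incumbent_value = 2
fixed_from_incumbent = cannot_be_representative(degree, min_colors=incumbent_value + 1)
```

The representative variables of the returned vertices, together with every variable in which one of them acts as representative of another vertex, could then be set to zero before the model is handed to the solver.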
3 Multi-greedy randomized heuristic
In this section, we present a multi-greedy randomized constructive heuristic for the b-coloring problem. The heuristic follows a two-phase framework similar to the one of ElgDesHacDusKhe06. In the first phase, an initial proper coloring, not necessarily a b-coloring, is generated. The second phase ensures a proper b-coloring is obtained starting from the coloring achieved in the first phase. In the remainder of this section, after presenting the pseudo-code of the two-phase framework, we describe the details of the first phase in Subsection 3.1 and of the second phase in Subsection 3.2.
The multi-greedy randomized constructive heuristic runs in (as it will be shown in Corollary 1), and is described in Algorithm 1. It takes as inputs the graph and two parameters regarding the sizes of restricted candidate lists (RCL) which will be defined later in this section, namely and . The algorithm returns a proper coloring and a set of colors . The heuristic uses the following structures:
: structure that represents the coloring which assigns a color to each vertex ;
: structure that represents the color neighborhoods of vertices in coloring ;
: set of colors used in coloring ;
: set of colors in that have -vertices.
The first phase of the approach is invoked in procedure INITIAL-COLORING (line 1), which will be detailed in Section 3.1, to obtain an initial proper coloring employing available colors. Observe that the upper bound was used instead of with the intention of not being too restrictive and give more flexibility for the heuristic to use colors that will be removed later in the second phase of the framework. The structures , , , and are determined by this call to INITIAL-COLORING.
As it was already mentioned, procedure INITIAL-COLORING does not ensure a -coloring, as some colors in might not have a -vertex. In order to obtain a feasible -coloring, the second phase is invoked in procedure FIND-B-COLORING (line 1), which will be detailed in Section 3.2, in order to remove colors from until a -coloring is achieved. The updated structures and are returned at the end of the execution of FIND-B-COLORING. RANDOMIZED-CONSTRUCTIVE thus returns the obtained -coloring as well as the set of used colors (line 1).
3.1 First phase: obtaining an initial coloring
An initial coloring is obtained using procedure INITIAL-COLORING, which is detailed in Algorithm 2. In addition to the graph , the algorithm also takes as input a parameter related to the size of restricted candidate lists. The structures , , , and will be returned at the end of its execution. We remark that, for ease of explanation, the pseudo-code which will be presented assumes the graph is connected. An easy way to overcome this fact will be given once the algorithm is described. The following structures are used by the algorithm:
: initial set of available colors;
: stores the set of vertices which had already been colored;
: keeps the vertices in which have no attributed color;
: set of colors in that were attributed to some vertex with degree at least .
INITIAL-COLORING uses the following auxiliary method:
HEURISTIC-COLOR-VERTEX: described in Algorithm 3, the procedure takes as inputs the graph , vertices and , as well as structures , , . The method returns a color to be attributed to vertex . Firstly, structure is initialized as empty (line 3), and will store the set of candidate colors for coloring . The algorithm then checks if has degree greater than or equal to (line 3) and tries initially to build with the set of colors that belong to the color neighborhood of neither nor , and have not been assigned to a vertex with degree greater than or equal to , i.e., , and (line 3). The purpose behind this coloring idea is to diversify the colors assigned to both neighborhoods of and , while trying to give different colors to vertices with high enough degrees to become b-vertices, in an attempt to increase the probability of finding b-vertices that represent the greatest number of color classes. If is still empty (line 3), the algorithm tries to include in colors that belong to the color neighborhood of neither nor , i.e., and (line 3). If no such color exists, i.e., remains with no elements in line 3, is built in line 3 with colors not belonging to the color neighborhood of , i.e., . This guarantees at least one color in since the algorithm initially works with available colors and . The color in with lowest index is returned in line 3.
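The three-tier candidate construction described for HEURISTIC-COLOR-VERTEX can be sketched as follows. All names are hypothetical: `v` is the vertex being colored, `u` the already colored vertex from which it was reached, `color_nbh` maps each vertex to its color neighborhood, `palette` is the set of available colors, `high_colors` holds the colors already given to high-degree vertices, and `threshold` stands for the degree threshold used in the text, whose exact value is an assumption left to the caller.

```python
from typing import Dict, Hashable, Set


def heuristic_color_vertex(
    adj: Dict[Hashable, Set[Hashable]],
    v: Hashable,
    u: Hashable,
    color_nbh: Dict[Hashable, Set[int]],
    palette: Set[int],
    high_colors: Set[int],
    threshold: int,
) -> int:
    """Pick a color for v, relaxing the candidate set in three tiers."""
    candidates: Set[int] = set()
    if len(adj[v]) >= threshold:
        # Tier 1: new to both color neighborhoods and unused on high-degree vertices
        candidates = palette - color_nbh[v] - color_nbh[u] - high_colors
    if not candidates:
        # Tier 2: new to both color neighborhoods
        candidates = palette - color_nbh[v] - color_nbh[u]
    if not candidates:
        # Tier 3: any color keeping the coloring proper; nonempty whenever the
        # palette has more colors than v has neighbors
        candidates = palette - color_nbh[v]
    return min(candidates)


# Triangle 0-1-2 in which vertex 0 already has color 1.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
nbh = {0: set(), 1: {1}, 2: {1}}
first = heuristic_color_vertex(triangle, 1, 0, nbh, {1, 2, 3}, {1}, threshold=2)

# With a tight palette only the last tier produces a candidate.
tight = heuristic_color_vertex(triangle, 1, 0, {0: {2}, 1: {1}, 2: set()}, {1, 2}, set(), threshold=2)
```

Returning the lowest-index candidate mirrors the final step of the description; the randomization of the heuristic comes from the vertex selection, not from this procedure.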
Algorithm 2 first initializes the used structures as follows. For each vertex , the color neighborhood of is initialized as empty and is set to 0, which implies that no color is assigned to (line 2). The sets , and are initialized as empty (line 2). Next, the algorithm sets as the maximum degree vertex in in line 2, where ties are broken arbitrarily. The set is initialized with colors in line 2, followed by the coloring of with color 1 in line 2. The structures , and are updated in line 2. The neighborhood of vertices in are yet to be explored, and the set is initialized with in line 2. The algorithm then performs a series of iterations to assign colors to the vertices in while the set is not empty in lines 2-2. Elements from a restricted candidate list (RCL) containing the best elements in are randomly chosen along the construction of the solution. Given the vertices in the greedy choice criterion for RCL is:
maximization of the vertex degree: .
RCL is defined as a subset of containing all candidates whose evaluation for the greedy criterion lies in an interval of values defined by a parameter . Define , thus this interval is given by . RCL is created in line 2. A vertex is randomly selected from RCL in line 2. Set is built in line 2 with the vertices in that have no assigned color. Similar to RCL, elements from a restricted candidate list containing the best elements in are randomly chosen along the construction of the solution. Given the vertices in , the greedy choice criterion for RCL is:
maximization of the vertex degree: .
RCL is defined for as RCL was defined for . RCL is created in line 2. A vertex is randomly selected from RCL in line 2 and receives a color determined by procedure HEURISTIC-COLOR-VERTEX in line 2. The structures , and are updated in line 2. The neighborhood of is yet to be explored, so the vertex is inserted into in line 2. Vertex is then removed from in line 2 and a new iteration resumes, until set becomes empty. Vertex is then withdrawn from in line 2. After all vertices have been colored, i.e., is empty, the algorithm updates the list of colors having -vertices in line 2. The structures , , and are then returned in line 2. Note that, for ease of explanation, the described pseudo-code assumes the graph is connected. However, in the case of a disconnected graph, this can be overcome by simply inserting into the uncolored vertex of highest degree (if there is at least one uncolored vertex) as a last step in the loop of lines 2-2 whenever becomes empty.
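The value-based restricted candidate lists used above follow the usual GRASP pattern. The sketch below assumes the interval is $[g_{\max} - \alpha (g_{\max} - g_{\min}),\ g_{\max}]$ for a maximization criterion $g$ and a parameter $\alpha \in [0, 1]$; this interval and the names `build_rcl` and `select` are assumptions of the illustration, not taken from the paper.

```python
import random
from typing import Callable, Hashable, List, Sequence


def build_rcl(
    candidates: Sequence[Hashable],
    score: Callable[[Hashable], float],
    alpha: float,
) -> List[Hashable]:
    """Candidates whose score lies within alpha of the best (maximization)."""
    values = {c: score(c) for c in candidates}
    best, worst = max(values.values()), min(values.values())
    cutoff = best - alpha * (best - worst)
    return [c for c in candidates if values[c] >= cutoff]


def select(
    candidates: Sequence[Hashable],
    score: Callable[[Hashable], float],
    alpha: float,
    rng: random.Random,
) -> Hashable:
    """Draw one element uniformly at random from the restricted candidate list."""
    return rng.choice(build_rcl(candidates, score, alpha))


# Vertex degrees used as the greedy criterion.
degrees = {"a": 5, "b": 3, "c": 1}
greedy_only = build_rcl(list(degrees), degrees.get, alpha=0.0)
half_range = build_rcl(list(degrees), degrees.get, alpha=0.5)
everything = build_rcl(list(degrees), degrees.get, alpha=1.0)
```

With $\alpha = 0$ the choice is purely greedy and with $\alpha = 1$ it is purely random, so the parameter controls the trade-off between the two.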
Algorithm 2 runs in .
Consider and to be ordered lists containing vertices sorted in nonincreasing order of vertex degree, which means that every element entering these lists should be inserted into the correct ordered position. Additionally, assume for each , and to be represented as -dimensional binary vectors, with each element representing whether color belongs to the corresponding set or not. Firstly, consider the running time to perform a single update of the structures. Note that there are updates of structures and each of them can be done in . The updates of and can all be done in . Thus, a single update of all the required structures can be done in . HEURISTIC-COLOR-VERTEX runs in , which is implied by the construction of and the selection of its minimum value. In Algorithm 2, the instructions of lines 2-2 run in . Line 2 can be done in . In order to determine the complexity of the while loop in lines 2-2, we perform an aggregated analysis. Note that each vertex is inserted into and removed from at most once and each insertion into this ordered list can be performed in , implying for all the insertions. As is kept as an ordered list, whenever a vertex is to be removed from , line 2 is carried out in . At the moment a vertex enters , lines 2-2 are executed in . We omit the entrance of vertices in from the analysis as they are directly related to their entrance in , i.e., whenever a vertex enters in line 2 it will be removed from in line 2 just after its entrance in . Therefore, the overall running time of Algorithm 2 is which is . ∎
3.2 Second phase: transforming the initial coloring into a b-coloring
A feasible -coloring is obtained using procedure FIND-B-COLORING, which is detailed in Algorithm 4. In addition to the graph and RCL size parameters and , the algorithm also takes as inputs the sets , , , and the coloring . Remark that the inputs and will be updated by the algorithm and will be returned at the end of its execution. The following structure is used:
: set of colors that do not have -vertices.
FIND-B-COLORING, which is a modification of the -algorithm mentioned in the introduction, consists in iteratively eliminating colors from the graph by recoloring vertices colored with colors in . The set is initialized with every color in in line 4. The algorithm then performs a series of iterations while is not empty (lines 4-4). Elements from a restricted candidate list containing the best elements in are randomly chosen along the construction of the solution. Given the colors in the greedy choice criterion for RCL is:
maximization of the color index: ;
Criterion aims to remove colors with higher index since after the execution of Algorithm 2, colors with smaller index are presumably closer to having a b-vertex. RCL is defined as a subset of containing its best candidates. RCL is created in line 4. A color is randomly selected from RCL in line 4. For each vertex colored with , i.e., , a new color is assigned to (lines 4-4). Note that any color in is available to color . Elements from a restricted candidate list containing the best elements in are randomly chosen along the construction of the solution. Before explaining the greedy criterion, let be the number of vertices adjacent to such that color is also not in their color neighborhood. Additionally, let be the set of colors adjacent to neither nor . Define as the minimum cardinality set among all for . Note that is the set of colors not adjacent to the vertex with the minimum number of missing colors in its color neighborhood. Given the colors in the greedy choice criteria for RCL are:
maximization of vertices with a new color added to their color neighborhood: ;
minimization of the color index considering the colors in : .
Criterion intends to increase the color neighborhood of as many vertices as possible, whereas aims to predict the vertex which is the closest to becoming a b-vertex. Given , RCL is defined as a subset of containing all candidates whose evaluation of the greedy criterion lies in an interval of values defined by a parameter . Define , thus this interval is given by . As for , RCL is defined as a subset of containing its best candidates.
RCL is created in line 4. Any of the greedy functions or can be chosen for the construction of RCL and they are selected at random with 50% chance each. Note that, as stated previously on the definition of and , the selection of the one to be used will define if RCL uses or . Vertex receives a color randomly selected from RCL in line 4. Color neighborhood of vertices in are then updated in line 4. After all vertices previously colored with have been assigned a new color, is removed from set in line 4.
The algorithm then certifies if colors in now have a -vertex in lines 4-4. Colors that now have a -vertex are removed from in line 4. Lastly, is removed from in line 4. The algorithm terminates when which implies , so the resulting -coloring and the set of used colors are returned in line 4.
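The overall effect of the second phase can be sketched in a simplified, deterministic form. This is not the randomized procedure of Algorithm 4: the restricted candidate lists are replaced by "highest-index color first" for the color to remove and "lowest-index admissible color" for each recoloring, and the set of colors lacking b-vertices is recomputed from scratch in every round. The function name and data layout are hypothetical.

```python
from typing import Dict, Hashable, Set, Tuple


def find_b_coloring(
    adj: Dict[Hashable, Set[Hashable]],
    color: Dict[Hashable, int],
) -> Tuple[Dict[Hashable, int], Set[int]]:
    """Remove colors without b-vertices from a proper coloring until none is left."""
    color = dict(color)  # work on a copy of the input coloring
    while True:
        used = set(color.values())
        # colors for which no vertex sees all the other used colors
        lacking = [
            k for k in used
            if not any(
                color[v] == k and used - {k} <= {color[u] for u in adj[v]}
                for v in adj
            )
        ]
        if not lacking:
            return color, used
        k = max(lacking)
        for v in [w for w in adj if color[w] == k]:
            seen = {color[u] for u in adj[v]}
            # v is not a b-vertex, so some other used color is missing around it;
            # vertices of color k are pairwise non-adjacent, so earlier
            # recolorings in this loop do not change `seen`
            color[v] = min(used - {k} - seen)


# Path a-b-c-d with a proper 3-coloring in which color 1 has no b-vertex.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
start = {"a": 1, "b": 2, "c": 3, "d": 1}
result, colors = find_b_coloring(path, start)
```

Each round removes exactly one color and keeps the coloring proper, so the loop stops after at most as many rounds as there are colors in the initial coloring.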
Algorithm 4 runs in .
Observe that Algorithm 4 performs a series of color removals and updates. The while loop of lines 4-4 is executed times, as each color is removed at most once. On any occasion a color is to be removed from , line 4 is carried out in . The foreach loop of lines 4-4 is executed times and each iteration is performed in , therefore the complete loop is executed in . The foreach loop of lines 4-4 is also executed times and the verification and possible updates are all performed in for each iteration, consequently the complete loop is executed in . Note that in order to perform the verification of line 4 in , one could keep for each an indicator vector corresponding to together with the number of nonzero entries in this vector, as well as an indicator vector corresponding to in conjunction with the number of nonzero entries in this vector. The verification could thus be performed by simply comparing the number of nonzero entries in these two indicator vectors. Algorithm 4 thus runs in , which is . ∎
Algorithm 1 runs in .
4 The matheuristic approach
In this section, before describing the matheuristic approach, we present its two main components: (a) the multi-start multi-greedy randomized metaheuristic and (b) the MIP (mixed integer programming) based fix-and-optimize local search procedure. The multi-start metaheuristic consists in performing a predefined number of iterations of the multi-greedy randomized heuristic and is described in Subsection 4.1. The MIP-based fix-and-optimize local search consists in solving a restricted MIP obtained by fixing certain decision variables and is described in Subsection 4.2. Finally, Subsection 4.3 presents the matheuristic approach which consists in the combination of the multi-start metaheuristic with the MIP-based fix-and-optimize local search procedure.
4.1 Multi-start multi-greedy randomized metaheuristic
The pseudo-code of the multi-start multi-greedy randomized metaheuristic is described in Algorithm 5. In addition to the graph and two parameters regarding the sizes of restricted candidate lists (RCL), namely and , the algorithm also takes as input , which represents the maximum number of iterations that the multi-greedy randomized heuristic will be executed.
Procedure MULTISTART-B-COL will save in the best obtained coloring. The set of used colors in , , is initialized as empty (Algorithm 5, line 5). The coloring generated at iteration is represented by and the corresponding set of used colors as . represents the solution value, which is the number of used colors in coloring . The loop in lines 5–5 performs iterations . The construction phase starts by invoking procedure RANDOMIZED-CONSTRUCTIVE to build the solution in line 5. In case an improving solution is obtained, the algorithm updates and in lines 5-5. If the solution value of matches the upper bound the execution of RANDOMIZED-CONSTRUCTIVE is terminated by returning in line 5, as the solution is proven to be optimal. Otherwise, a new iteration begins until the maximum number of iterations is exceeded. The solution with the highest number of used colors, i.e., the best solution encountered by the multi-start phase, is returned in line 5.
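The multi-start loop described above reduces to keeping the best of several independent constructions and stopping early when the upper bound is met. The sketch below is a hypothetical rendering: `construct` stands for one call to the randomized constructive heuristic and returns a coloring together with its set of used colors, and `upper_bound` is whichever valid upper bound on the number of colors is available.

```python
from typing import Callable, Dict, Hashable, Optional, Set, Tuple

Solution = Tuple[Dict[Hashable, int], Set[int]]


def multistart(
    construct: Callable[[], Solution],
    max_iter: int,
    upper_bound: int,
) -> Tuple[Optional[Dict[Hashable, int]], Set[int]]:
    """Run the constructive heuristic up to max_iter times and keep the best result."""
    best_coloring: Optional[Dict[Hashable, int]] = None
    best_colors: Set[int] = set()
    for _ in range(max_iter):
        coloring, colors = construct()
        if len(colors) > len(best_colors):
            best_coloring, best_colors = coloring, colors
        if len(best_colors) == upper_bound:
            # the incumbent matches the upper bound, so it is optimal
            break
    return best_coloring, best_colors


# Stub constructive heuristic returning a fixed sequence of solutions.
runs = iter([
    ({"a": 1, "b": 1}, {1}),
    ({"a": 1, "b": 2}, {1, 2}),
    ({"a": 1, "b": 1}, {1}),
])
best, colors = multistart(lambda: next(runs), max_iter=3, upper_bound=2)
```

In the stub example the second construction already reaches the upper bound, so the third one is never requested.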
4.2 MIP-based fix-and-optimize local search
Given an available feasible solution, the MIP-based fix-and-optimize local search procedure consists in generating a subproblem obtained from the original b-coloring problem by fixing certain decision variables at the values they assume in the available feasible solution, which is also offered as a warm start for the MIP solver. With fewer variables remaining to be optimized, it is expected that the resulting subproblem is more tractable by a standard MIP solver than the original problem. The MIP-based fix-and-optimize local search is described in Algorithm 6. In addition to the graph and an initial feasible solution, represented by a coloring and its set of used colors, the algorithm also takes as input the maximum time allowed for solving the obtained MIP formulation, given by MAXTIME. In our framework, the initial feasible solution offered to MIP-LS will be the currently best known solution returned by MULTISTART-B-COL.
The set of representative -vertices and the set of vertices that cannot be representatives in an improving solution are initialized as empty in line 6 of Algorithm 6. Set is built according to the input solution in the foreach loop of lines 6-6. Following Observations 1 and 2, set is built from the input solution in the foreach loop of lines 6-6 with all vertices which are not -vertices in coloring and have degree strictly smaller than its number of colors . Line 6 solves a mixed integer program defined by the formulation presented in Section 2, in which all variables in are fixed to one (i.e., all corresponding vertices are selected to be representatives in the solution) and all vertices in are fixed to zero. Additionally, coloring is provided as a warm start, i.e., as an initial feasible solution. Fixing is achieved by adding the following additional constraints to the formulation
The best solution obtained by the resulting MIP restricted to a maximum time limit MAXTIME is returned in line 6. Note that the input coloring is always feasible for this MIP subproblem.
4.3 Matheuristic approach
Combinations of metaheuristics with exact algorithms from mathematical programming approaches such as mixed integer programming (MIP), called matheuristics, have received considerable attention over the last few years. It has been acknowledged by the optimization research community that combining efforts from exact and metaheuristic approaches can achieve better solutions than pure classic methods (Raidl and Puchinger, 2008; Dumitrescu and Stützle, 2009). Matheuristics frequently rely on metaheuristics as the main method to compute good quality solutions, with the exact approach used to enhance these solutions by solving subproblems.
Motivated by recent successful results obtained by matheuristics (Doi et al., 2018; Cunha et al., 2019; Perumal et al., 2019; Melo et al., 2021), we combine the multi-start metaheuristic MULTISTART-B-COL that appears in Algorithm 5 with the MIP-based fix-and-optimize local search procedure presented in Algorithm 6, which produces the matheuristic MSBCOL:
Step 1: , MULTISTART-B-COL();
Step 2: , MIP-LS(, , , MAXTIME);
Step 3: Return , .
5 Computational experiments
All computational experiments were carried out on a machine running Ubuntu x86-64 GNU/Linux, with an Intel Core i7-8700 Hexa-Core 3.20 GHz processor and 16 GB of RAM. The metaheuristic was coded in C++ and the formulation was solved using CPLEX 12.8 under standard configurations. Each execution of the solver was limited to one hour (3,600s). Subsection 5.1 describes the benchmark instances. Subsection 5.2 lists the tested approaches and reports the parameter settings. Subsections 5.3 and 5.4 summarize the computational results for small and large instances, respectively. Finally, Subsection 5.5 compares some of the obtained computational results with a state-of-the-art metaheuristic presented in FisPetMerCre15, considering a subset of the large instances.
5.1 Benchmark instances
The tests were carried out on a set of benchmark instances divided into small ( 10,000 edges) and large ( 10,000 edges) graphs, which is composed of:
new randomly generated instances;
instances from the Second DIMACS Implementation Challenge.
The new set of instances was constructed using the graph generator ggen Morgenstern and includes bipartite, geometric and random graphs. Small instances were created with the following parameters: (a) vertices; (b) edge probability for random and bipartite graphs and the Euclidean distance for geometric graphs lie in . Five instances were generated for each combination of number of vertices and edge probability (or Euclidean distance for the geometric graphs). Instances with the same characteristics, but different seeds, are organized into instance groups. Each instance group is identified by , where represents the class of the graph: random (), bipartite (), and geometric (); gives the number of vertices and denotes the edge probability for random and bipartite graphs, and the Euclidean distance for geometric graphs. More challenging large bipartite and random instances were also created in a similar fashion but with the number of vertices in . We remark that all results reported for this set of instances represent average values over the corresponding instance group.
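The three graph classes can be illustrated with a minimal sketch. It does not reproduce ggen: the function names, the equal-sized bipartition, and the placement of points uniformly in the unit square are assumptions made only to show how an edge probability or a Euclidean distance threshold, together with a seed, determines an instance.

```python
import itertools
import math
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def random_graph(n: int, p: float, seed: int) -> List[Edge]:
    """Each of the n(n-1)/2 possible edges is present with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p]


def bipartite_graph(n: int, p: float, seed: int) -> List[Edge]:
    """Edges only between the two halves (assumed equal-sized), each with probability p."""
    rng = random.Random(seed)
    half = n // 2
    return [(u, v) for u in range(half) for v in range(half, n)
            if rng.random() < p]


def geometric_graph(n: int, d: float, seed: int) -> List[Edge]:
    """Points drawn in the unit square (assumption); edge if distance is at most d."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if math.dist(points[u], points[v]) <= d]
```

Under these assumptions, an instance group of five graphs sharing the same characteristics would correspond to something like `[random_graph(n, p, seed) for seed in range(5)]`, with `n` and `p` chosen by the user.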
We also use the graphs from the benchmark instances of the Second DIMACS Implementation Challenge, as they are widely used in the literature, especially for coloring and maximum clique problems Avanthay et al. (2003); Lü and Hao (2010); Moalic and Gondran (2018); Nogueira et al. (2018); San Segundo et al. (2019). The instances are identified by their original filenames and can be obtained from the DIMACS Implementation Challenges website Trick et al. (2015). We refer to the instances for coloring problems as graph coloring instances and to those for maximum clique problems as maximum clique instances. The complete benchmark instances, along with detailed results for each instance, are available in MelQueSan20 at Mendeley Data.
5.2 Tested approaches and parameter settings
In this subsection we present the tested approaches and the preliminary experiments carried out to determine the parameters of the proposed techniques. The following approaches were considered in the computational experiments:
MSBCOL: run exclusively MULTISTART-B-COL in parallel using all cores of the target machine;
MSBCOL: run the matheuristic, using the best solution encountered by the metaheuristic MSBCOL as a warm start for the MIP-based fix-and-optimize local search procedure;
MSBCOL: run the complete integer programming formulation presented in Section 2, using the best solution encountered by the metaheuristic MSBCOL as a warm start. Following Observations 1 and 2, variables corresponding to vertices with degree less than or equal to this best solution value are fixed to zero, as long as they are not b-vertices in the warm start solution;
IP: run the integer programming formulation presented in Section 2 without any initial solution or fixing of variables.
This test strategy was adopted to evaluate the behavior of the newly proposed methods according to the class and size of the benchmark instances. Furthermore, we wanted to verify the effectiveness of the MIP-based fix-and-optimize local search when compared with the complete formulation.
Define to be the density of , calculated as , and let the maximum number of iterations for MULTISTART-B-COL, , be computed as . This formula for can be interpreted as follows. The minimum number of iterations that the algorithm executes is given by the first part of the formula, which is the constant 100. The variable number of iterations given by is inversely proportional to the size and density of the graph, as iterations become more time consuming on larger and denser graphs. This choice attempts to allow a reasonable number of iterations so as to avoid poor performance of the algorithm. The experiments to tune the parameter values are reported next. We randomly selected a small subset containing approximately 5% of the instances, with varying characteristics, for parameter tuning. The following values were tested for each parameter:
The best obtained parameter values for MULTISTART-B-COL were and .
5.3 Small instances
Tables 1-3 report the results for MSBCOL, MSBCOL, MSBCOL and IP on the new set of generated small instances composed of bipartite, geometric and random graphs. The first column identifies the instance group. Columns 2 to 4 report the number of vertices (), the average number of edges () and the average solution upper bound for the instance group (). Columns 5 to 8 give, for MSBCOL, the best encountered solution values (), the average solution values over the executed iterations (), the average running times in seconds (), and the percentage gap between the solution found by MSBCOL and the best obtained solution (), calculated as (). Columns 9 and 10 give, for MSBCOL, the encountered solution values () and the average running times in seconds () for the MIP-based local search procedure. Columns 10 to 15 give, for the exact approaches MSBCOL and IP, the encountered solution values ( and , respectively), the average running times to solve the instances to optimality (), and the average open gaps (in %) of the unsolved instances (), calculated as , where represents the best known integer solution and the best upper bound achieved at the end of the execution. The last two rows report the number of best known solutions found by each of the proposed approaches (), and, for MSBCOL and IP, the number of instances solved to optimality ().
The value ’n/a’ in a cell indicates that, for at least one instance in the group, either the solver exceeded the time limit before obtaining a feasible solution or the execution was halted by the operating system due to memory limitations. The value ’t.l.’ for column means that none of the instances in the group was solved to optimality within the time limit of 3,600 seconds using the corresponding integer program. The value ’-’ for column indicates that all five instances in the group were solved to optimality. The best encountered solution values are shown in bold.