1 Introduction
In social network analysis, detecting a large cohesive subgraph is a fundamental and extensively studied topic with various applications. The clique is a classical and ideal model in the field of cohesive subgraph detection. A graph is a clique if there is an edge between every pair of vertices. The Maximum Clique Problem, that is, to find a clique of maximum size in a given graph, is a fundamental problem in graph theory and finds wide application in many fields, such as biochemistry and genomics [Butenko and Wilhelm, 2006], wireless networks [Lakhlef, 2015], data mining [Boginski et al., 2006; Conte et al., 2018] and many others.
However, in some real-world applications, the networks of interest may be built from empirical data containing noise and faults. In these cases, large cohesive subgraphs hardly ever appear as ideal cliques. To tackle this problem, many clique relaxation models have been proposed. In this paper, we focus on the $k$-plex, a degree-based clique relaxation model. A simple undirected graph with $n$ vertices is a $k$-plex if each vertex of this graph has at least $n-k$ neighbors. The maximum $k$-plex problem, that is, to find a $k$-plex of maximum size in a given graph for a given integer $k$, has received increasing attention from researchers in the fields of social network analysis and data mining [Xiao et al., 2017; Conte et al., 2018].
The decision version of the maximum $k$-plex problem is known to be NP-complete [Balasundaram et al., 2011]. Different algorithms have been developed for this problem, including exact algorithms and heuristic ones. Balasundaram et al. [2011] proposed a branch-and-bound algorithm based on a polyhedral study of this problem. McClosky and Hicks [2012] developed two branch-and-bound algorithms adapted from combinatorial clique algorithms. Recently, Xiao et al. [2017] proposed an exact algorithm which breaks the trivial exponential bound of $2^n$ for the maximum $k$-plex problem with $k \geq 3$. Gao et al. [2018] proposed several graph reduction methods and integrated them into a branch-and-bound algorithm.
Because exact algorithms for the maximum $k$-plex problem take exponential time in the worst case, several heuristic approaches have been proposed to provide a satisfactory solution within an acceptable time. Gujjula et al. [2014] proposed a hybrid metaheuristic based on the GRASP method. Miao and Balasundaram [2017] improved the construction procedure to provide a better initial solution for the GRASP method. Zhou and Hao [2017] developed a tabu search algorithm named FDTS which achieved state-of-the-art performance.
Local search is likened to “trying to find the top of Mount Everest in a thick fog while suffering from amnesia” [Russell and Norvig, 2016]. For a long time, much effort has been devoted to equipping local search with a memory mechanism. These works can be roughly divided into two groups. The first group focuses on exploiting the search history to guide the search into a more promising area. For example, Boyan and Moore [2000] proposed the STAGE algorithm, which learns an evaluation function from features of visited states and uses it to bias the future search trajectory. Zhou et al. [2018] presented a probability learning based local search algorithm for the graph coloring problem. The other group focuses on reducing the inherent cycling problem of local search. The tabu mechanism [Glover and Laguna, 1998] maintains a short-term memory of the recent search steps to forbid reversing the recent changes. The Configuration Checking strategy [Cai et al., 2011] keeps a memory of the state changes of local structures and reduces the cycling problem by prohibiting cycling locally.
When the search gets stuck in a local optimum, a good perturbation mechanism can modify the candidate solution and generate a promising search area for the following search steps. Inspired by the multi-armed bandit problem and its algorithms, we propose the bandit learning based perturbation mechanism (BLP), which learns online to select a good vertex for perturbation. To the best of our knowledge, this is the first attempt to combine reinforcement learning and local search for the maximum $k$-plex problem.
Recently, Configuration Checking (CC) and its variants have been successfully applied to various combinatorial optimization problems [Cai et al., 2011; Wang et al., 2016; Wang et al., 2018], revealing the importance of exploiting the structural properties of the problems. Unlike the tabu mechanism, CC is a parameter-free strategy which exploits circumstance information to reduce the cycling problem in local search. However, CC and its variants have the following limitations. Firstly, the use of the configuration information is limited to handling the cycling problem. Secondly, the forbidding strength of CC and its variants is static and cannot adjust to different problem instances. In this paper, we propose a variant of CC, named Dynamic-threshold Configuration Checking (DTCC), which extends the original CC in two respects. One is the neighbor quality heuristic, which evaluates a vertex with consideration of the community it belongs to. The other is the dynamic threshold mechanism, which enables an adaptive forbidding strength for CC.
Based on BLP and DTCC, we develop a local search algorithm, called BDCC, and improve it by a hyper-heuristic strategy. The resulting algorithm, named BDCCH, can learn to select a good heuristic in the adding and swapping phases. The experiments show that our algorithms dominate FDTS on the standard DIMACS and BHOSLIB benchmarks. Our algorithms are not only robust and time-efficient, but also provide better lower bounds on the size of the maximum $k$-plexes for most hard instances. Besides, our algorithms achieve state-of-the-art performance on massive graphs.
The remainder of this paper is organized as follows. Section 2 gives some necessary background knowledge. Section 3 provides some formal definitions and proposes the BLP mechanism. Section 4 proposes the DTCC strategy, which extends CC in two respects. In Section 5, we present the BDCC algorithm and improve it by a hyper-heuristic strategy. Section 6 shows the experimental results. Section 7 gives concluding remarks.
2 Preliminaries
2.1 Basic Definitions and Notations
An undirected graph is defined as $G=(V,E)$, where $V$ is a set of vertices and $E$ is a set of edges. Each edge $e$ consists of two vertices, denoted as $e=\{u,v\}$, where $u$ and $v$ are the endpoints of this edge. Two vertices are neighbors if they belong to an edge. Let $N(v)$ denote the set of all neighbors of $v$. The degree of vertex $v$ is defined as $|N(v)|$. For a vertex set $S \subseteq V$, let $N(S)$ be the set of neighbors of $S$ and $G[S]$ be the induced graph of $S$.
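Given a graph $G=(V,E)$ and an integer $k$, a subset $S \subseteq V$ is a $k$-plex if $|N(v) \cap S| \geq |S|-k$ for all $v \in S$. A vertex $v \in S$ is a saturated vertex if $|N(v) \cap S| = |S|-k$. The saturated set of $S$ is the set of all saturated vertices in $S$. A vertex $v \in S$ is a deficient vertex if $|N(v) \cap S| < |S|-k$. Obviously, any subset containing deficient vertices cannot be a $k$-plex.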
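The definitions above can be checked mechanically. The following is a minimal sketch, assuming the standard degree condition $|N(v) \cap S| \geq |S|-k$ for a $k$-plex and a hypothetical adjacency-set representation of the graph; the function names are illustrative and are not taken from the authors' implementation.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # hypothetical representation: vertex -> set of neighbors


def inner_degree(graph: Graph, subset: Set[int], v: int) -> int:
    """Number of neighbors of v that lie inside subset."""
    return len(graph[v] & subset)


def is_kplex(graph: Graph, subset: Set[int], k: int) -> bool:
    """True if every vertex of subset has at least |subset| - k neighbors in it."""
    need = len(subset) - k
    return all(inner_degree(graph, subset, v) >= need for v in subset)


def saturated_vertices(graph: Graph, subset: Set[int], k: int) -> Set[int]:
    """Vertices whose inner degree equals |subset| - k exactly."""
    need = len(subset) - k
    return {v for v in subset if inner_degree(graph, subset, v) == need}


def deficient_vertices(graph: Graph, subset: Set[int], k: int) -> Set[int]:
    """Vertices whose inner degree falls below |subset| - k."""
    need = len(subset) - k
    return {v for v in subset if inner_degree(graph, subset, v) < need}


# A path 0 - 1 - 2 - 3 used as a toy instance.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

On this toy path, `{0, 1, 2}` is a 2-plex whose end vertices are saturated, while the whole path is not a 2-plex because both end vertices become deficient.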
A candidate solution is a subset of $V$. Given a graph $G$, an integer $k$ and a feasible $k$-plex $S$, a typical 3-phase local search algorithm for the maximum $k$-plex problem maintains a feasible $k$-plex as its candidate solution and uses three operators, Add, Swap and Perturb, to modify it iteratively [Zhou and Hao, 2017]. The set $V \setminus S$ is split into three disjoint sets, one per operator, which contain the objects of the above-mentioned operators. In what follows we call them the add set, the swap set and the perturb set.
Obviously, the vertices in the add set can be added into $S$ directly. The vertices in the swap set can be added into $S$ while removing one vertex from $S$. Adding a vertex of the perturb set into $S$ would cause two or more deficient vertices in $S$. Therefore, these deficient vertices should be removed to maintain a feasible $k$-plex.
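To make the three-way split concrete, here is a brute-force sketch. It is only one reading of the description above: a vertex is treated as addable if adding it keeps a $k$-plex, as swappable if feasibility can be restored by removing exactly one vertex of $S$, and as a perturbation candidate otherwise. The formal set definitions of the source are not reproduced here, so the function and variable names, and the exact membership tests, are assumptions.

```python
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]  # hypothetical representation: vertex -> set of neighbors


def is_kplex(graph: Graph, subset: Set[int], k: int) -> bool:
    """Standard k-plex test: each vertex has at least |subset| - k inner neighbors."""
    need = len(subset) - k
    return all(len(graph[v] & subset) >= need for v in subset)


def classify_candidates(
    graph: Graph, plex: Set[int], k: int
) -> Tuple[Set[int], Set[int], Set[int]]:
    """Split V \\ S into (add, swap, perturb) candidates by brute force."""
    add_set, swap_set, perturb_set = set(), set(), set()
    for v in set(graph) - plex:
        grown = plex | {v}
        if is_kplex(graph, grown, k):
            add_set.add(v)  # v can join S directly
        elif any(is_kplex(graph, grown - {u}, k) for u in plex):
            swap_set.add(v)  # v can join S at the cost of one removal
        else:
            perturb_set.add(v)  # adding v needs two or more removals
    return add_set, swap_set, perturb_set


# Toy instance with k = 1 (cliques): S = {0, 1, 2} is a triangle.
toy = {0: {1, 2, 3, 5}, 1: {0, 2, 3, 5}, 2: {0, 1, 5}, 3: {0, 1}, 4: set(), 5: {0, 1, 2}}
sets = classify_candidates(toy, {0, 1, 2}, 1)
```

A practical implementation would maintain these sets incrementally from inner-degree counters rather than re-testing feasibility for every vertex, which this sketch does not attempt.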
2.2 Multi-armed Bandit Problem
The Multi-armed Bandit Problem (MAB) is a one-state reinforcement learning (RL) problem. In this problem, there is a set of arms and an agent which repeatedly selects an arm to play at each step with the aim of maximizing the long-term expected reward. Since the reward distribution of each arm is unknown, the agent faces an exploration-exploitation tradeoff. On the one hand, it needs to explore by selecting each arm to estimate its expected reward. On the other hand, it needs to exploit the existing knowledge by choosing arms with high expected rewards. The exploration-exploitation tradeoff is a fundamental issue in reinforcement learning, and the $\epsilon$-greedy strategy is widely used to balance the two. With the $\epsilon$-greedy strategy, the agent chooses actions randomly for exploration with probability $\epsilon$ and makes choices greedily for exploitation with probability $1-\epsilon$.
2.3 Configuration Checking
Configuration Checking (CC) [Cai et al., 2011] is a parameter-free strategy that exploits the structural properties of the problem to reduce the cycling problem in local search. The configuration of a vertex is defined as the states of its neighbors. The main idea of the CC strategy is that if the configuration of a vertex has remained unchanged since its last removal from the candidate solution, then it is forbidden to be added back into the candidate solution.
Recently, different CC variants have been proposed and successfully applied to various combinatorial optimization problems [Cai et al., 2011; Wang et al., 2017a; Wang et al., 2017b]. Here we highlight the Strong CC (SCC) strategy, which was proposed in [Wang et al., 2016] for the Maximum Weight Clique Problem. The difference between SCC and CC is that SCC allows a vertex $v$ to be added into the candidate solution only when some of $v$’s neighbors have been added since $v$’s last removal, while CC allows the adding of a vertex $v$ when some of $v$’s neighbors have been either added or removed.
Due to the similarity of cliques and $k$-plexes, it is natural to think of applying SCC to local search for the maximum $k$-plex problem. A straightforward SCC strategy for the maximum $k$-plex problem can be implemented as follows. We maintain a Boolean array $confChange$ to indicate whether the configuration of each vertex has been changed. Only when a vertex $v$ satisfies the SCC condition $confChange[v]=1$ can it be added into $S$. Initially we set $confChange[v]=1$ for all $v \in V$. When a vertex $v$ is added into the candidate solution $S$, for all $u \in N(v)$, $confChange[u]$ is set to 1. When a vertex $v$ is removed from $S$, $confChange[v]$ is set to 0. As for a swap step, where vertex $u$ is added into $S$ at the cost of the removal of vertex $v$, $confChange[v]$ is set to 0.
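The bookkeeping described above can be sketched as a small class. This is an illustration of the SCC rules as they are commonly stated (a flag that is set when a neighbor is added and cleared when the vertex itself leaves the solution); the class and method names, and the array name `conf_change`, are assumptions rather than the authors' code.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # hypothetical representation: vertex -> set of neighbors


class StrongCC:
    """Minimal Strong Configuration Checking bookkeeping (illustrative only)."""

    def __init__(self, graph: Graph) -> None:
        self.graph = graph
        # Initially every vertex is allowed to be added.
        self.conf_change = {v: True for v in graph}

    def allowed(self, v: int) -> bool:
        """SCC condition: v may be added only if its flag is set."""
        return self.conf_change[v]

    def on_add(self, v: int) -> None:
        """Adding v changes the configuration of all its neighbors."""
        for u in self.graph[v]:
            self.conf_change[u] = True

    def on_remove(self, v: int) -> None:
        """A removed vertex is forbidden until one of its neighbors is added."""
        self.conf_change[v] = False

    def on_swap(self, added: int, removed: int) -> None:
        """In a swap step only the removed vertex is forbidden."""
        self.conf_change[removed] = False


path = {0: {1}, 1: {0, 2}, 2: {1}}
scc = StrongCC(path)
scc.on_remove(1)          # vertex 1 leaves the solution and becomes forbidden
blocked = scc.allowed(1)
scc.on_add(0)             # a neighbor of 1 is added, which releases vertex 1
released = scc.allowed(1)
```

Note that a removal never sets any flag, which is exactly what makes SCC stricter than plain CC.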
3 Learning from History: BLP Mechanism
According to the definitions in Subsection 2.1, as $S$ grows, the add set and the swap set become smaller because the vertices in them need to satisfy more constraints. Therefore, the perturb set usually contains most of the vertices in $V \setminus S$ when a local optimum is reached. It is difficult to select a good vertex for perturbation from such a large set. We propose the bandit learning based perturbation mechanism (BLP), which learns online from the search history to select a good vertex for perturbation. In this section, we give some necessary formal definitions and present the BLP mechanism.
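Definition 1.
Given a graph $G=(V,E)$ and a $k$-plex $S$, an action is a pair $a=(op, v)$, where $op$ is the operator and $v$ is the object. The available action set of $S$ consists of every such pair whose object lies in the subset of $V \setminus S$ associated with its operator. Let $S \xrightarrow{a} S'$ denote that applying action $a$ to $S$ results in a new $k$-plex $S'$.
Definition 2.
A search trajectory is a finite sequence of $k$-plexes $(S_0, S_1, \ldots, S_m)$ such that for $0 \leq i < m$, $S_i \xrightarrow{a_i} S_{i+1}$ for some available action $a_i$.
Definition 3.
The walk of a search trajectory $(S_0, S_1, \ldots, S_m)$ is the ordered action sequence $(a_0, a_1, \ldots, a_{m-1})$, where $a_i$ is an available action of $S_i$ and $S_i \xrightarrow{a_i} S_{i+1}$ for $0 \leq i < m$.
Definition 4.
Given a search trajectory $T=(S_0, S_1, \ldots, S_m)$, a $k$-plex $S_i$ in $T$ is a breakthrough point if $|S_i| > |S_j|$ for all $j < i$, and an episode is a subsequence of $T$ between two adjacent breakthrough points.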
The underlying consideration of BLP is that all the Perturb actions in the walk of an episode contribute to the quality improvement at the end of this episode. BLP treats each vertex as an arm in MAB and rewards the vertices according to their contribution when an episode is completed. Therefore, the expected reward of a vertex reflects the possibility of reaching another breakthrough point if the candidate solution is perturbed with this vertex. In the implementation, BLP maintains a value $Q_v$ for each vertex $v$, initialized to 0 at the start of the search. When an episode is completed, we reward the objects of all the Perturb actions in the corresponding walk. For a vertex $v$ to reward, we update $Q_v$ with the exponential recency weighted average (ERWA) technique [Sutton and Barto, 1998], as shown in Equation 1.
$$Q_v \leftarrow Q_v + \alpha \, (r - Q_v) \qquad (1)$$
Here $\alpha$ is a factor called the step size, which determines the weight given to the recent reward, and $r = 1/n$, where $n$ is the number of Perturb actions in the walk of this episode. The intuition behind the reciprocal reward value is that the actions applied in a shorter episode are more valuable than those in a longer one. In the perturbation phase, BLP selects a vertex with the $\epsilon$-greedy strategy.
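The reward update and the selection rule can be sketched together. The example below assumes the standard ERWA form, in which each estimate moves a fraction of the way towards the latest reward, a reward equal to the reciprocal of the number of Perturb actions in the episode, and plain epsilon-greedy selection. The class name, the default step size and the default exploration rate are hypothetical placeholders, not values from the paper.

```python
import random
from typing import Dict, Hashable, Iterable, List, Optional


class BanditPerturbation:
    """Illustrative bandit-style vertex selection for the perturbation phase."""

    def __init__(self, vertices: Iterable[Hashable], alpha: float = 0.5,
                 epsilon: float = 0.1, rng: Optional[random.Random] = None) -> None:
        self.q: Dict[Hashable, float] = {v: 0.0 for v in vertices}  # estimates start at 0
        self.alpha = alpha      # step size (placeholder value)
        self.epsilon = epsilon  # exploration probability (placeholder value)
        self.rng = rng or random.Random()

    def select(self, candidates: Iterable[Hashable]) -> Hashable:
        """Epsilon-greedy choice among the perturbation candidates."""
        pool: List[Hashable] = list(candidates)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(pool)  # explore uniformly at random
        best = max(self.q[v] for v in pool)
        return self.rng.choice([v for v in pool if self.q[v] == best])  # exploit

    def reward_episode(self, perturbed: List[Hashable]) -> None:
        """Reward every Perturb object of a finished episode with r = 1 / n."""
        if not perturbed:
            return
        r = 1.0 / len(perturbed)
        for v in perturbed:
            self.q[v] += self.alpha * (r - self.q[v])  # ERWA update


blp = BanditPerturbation([0, 1, 2], alpha=0.5, epsilon=0.0, rng=random.Random(0))
blp.reward_episode([1, 2])   # two perturbations, so each earns r = 0.5
blp.reward_episode([1])      # a shorter episode, so vertex 1 earns r = 1.0
choice = blp.select([0, 1, 2])
```

With the exploration rate set to zero, the selection is purely greedy, so the vertex rewarded in the shorter episode is chosen.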
4 Dynamic-threshold Configuration Checking
According to our previous experiments, applying CC or other CC variants directly to the maximum $k$-plex problem does not lead to good performance on graphs with high edge density. The reason is that the configurations of the high-degree vertices in these graphs are very likely to change, and CC (or other CC variants) cannot enhance its forbidding strength on these vertices. To make better use of the configuration information and enable an adaptive forbidding strength, we propose a new variant of CC named Dynamic-threshold Configuration Checking (DTCC). The two parts of DTCC are the neighbor quality heuristic and the dynamic threshold mechanism.
4.1 Neighbor Quality Heuristic
The neighbor quality of a vertex $v$ is defined from two counts: the total number of times a vertex in $N(v)$ is added into the candidate solution and the total number of times a vertex in $N(v)$ is removed from it. Due to the cohesive characteristics of a $k$-plex, a vertex that belongs to a higher-quality community is more likely to appear in a large $k$-plex. In the implementation, DTCC maintains an integer for each vertex and updates the value with the following rule.
DTCCNQRule. The value is set to for all . When a vertex is added into candidate solution , for all , . When a vertex is removed from , for all , .
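One way to maintain such a per-vertex value is sketched below. The exact formula and increments are not recoverable from the text here, so this example assumes that the neighbor quality of a vertex is the number of additions minus the number of removals among its neighbors, starting from 0 and changing by 1 per operation. Both the assumption and the names are illustrative only.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # hypothetical representation: vertex -> set of neighbors


class NeighborQuality:
    """Illustrative neighbor-quality counter (assumed: additions minus removals)."""

    def __init__(self, graph: Graph) -> None:
        self.graph = graph
        self.nq = {v: 0 for v in graph}  # assumed initial value

    def on_add(self, v: int) -> None:
        """Adding v raises the quality of every neighbor of v by one."""
        for u in self.graph[v]:
            self.nq[u] += 1

    def on_remove(self, v: int) -> None:
        """Removing v lowers the quality of every neighbor of v by one."""
        for u in self.graph[v]:
            self.nq[u] -= 1


path = {0: {1}, 1: {0, 2}, 2: {1}}
quality = NeighborQuality(path)
quality.on_add(1)
quality.on_add(0)
quality.on_remove(1)
```

After these three operations only vertex 1 keeps a positive value, because its neighbor 0 was added and never removed.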
4.2 Dynamic Threshold Mechanism
We extend $confChange$ to an integer array and maintain an integer array $threshold$ that can adjust the forbidding strength on different vertices. A vertex $v$ is allowed to be added into the candidate solution only when the DTCC condition $confChange[v] \geq threshold[v]$ is satisfied. The following four rules specify the dynamic threshold mechanism.
DTCCInitialRule. In the beginning of search process, for all , .
DTCCAddRule. When is added into candidate solution, , , and for all , .
DTCCSwapRule. When is added into candidate solution at the cost of removal of , .
DTCCPerturbRule. When is added into candidate solution and a set of vertices is removed from this candidate solution, for all , , and for all .
Note that the SCC strategy is a special case of the DTCC strategy in which $confChange$ is a Boolean array and $threshold$ is fixed to 1. Lemma 1 illustrates their relation.
Lemma 1.
If a vertex satisfies the DTCC condition, then it satisfies the SCC condition. The reverse is not necessarily true.
Proof.
According to the DTCC rules, $threshold[v] \geq 1$ for all $v \in V$ during the search process. If the DTCC condition holds, then $confChange[v] \geq threshold[v] \geq 1$. So at least one neighbor of $v$ must have been added into the candidate solution since the last time $v$ was removed. So the SCC condition is satisfied.
Suppose $v$ satisfies the SCC condition with $confChange[v]=1$, but $threshold[v] > 1$. In this case the DTCC condition is not satisfied. ∎
According to Lemma 1, we can conclude that DTCC has a stronger forbidding strength than SCC. Generally, a frequently operated vertex has a high threshold and is more likely to be forbidden. Thus the algorithm is forced to select other vertices to explore the search space.
5 BDCC Algorithm and a Hyper-heuristic
5.1 BDCC Algorithm
Based on the BLP mechanism and the DTCC strategy, we develop a local search algorithm named BDCC, whose pseudocode is shown in Algorithm 1. Initially, the best $k$-plex found, denoted as $S^*$, is initialized as the empty set. In each loop (lines 3-10), an initial solution is first constructed (line 4) as the starting point of the search trajectory, and the search procedure starts. If the best solution in this search trajectory is better than the best solution ever found, $S^*$ is updated and the peeling function (line 8) is called to reduce the graph. If the reduced graph has too few vertices to contain a larger $k$-plex, then $S^*$ is returned as an optimum solution. The three major components of BDCC are initial solution construction, the search procedure and graph peeling. We describe them in detail in the following.
We adopt the construction function of FDTS [Zhou and Hao, 2017]. The function first picks the least frequently operated vertex (breaking ties randomly) in a random sample of vertices to create a singleton set $S$, and then repeatedly adds the least frequently operated vertex (breaking ties randomly) of the add set into $S$ until the add set is empty. Then the final $k$-plex is returned as the initial solution. By giving priority to vertices that are operated less frequently, the construction procedure can generate diversified initial solutions in different rounds.
The search function iteratively selects one action to modify the candidate solution until the iteration limit is reached or the available action set is empty, as shown in Algorithm 2. It records the best-quality solution in the search trajectory so far and the action sequence since the last breakthrough point. The algorithm selects vertices with the highest neighbor quality for adding and swapping, and selects vertices for perturbation according to their learned values with the $\epsilon$-greedy strategy. After each iteration, if the candidate solution is larger than the best solution recorded so far, a breakthrough point is reached and the episode is completed; the objects of all Perturb actions in the current action sequence (if any) are then rewarded with the ERWA technique and the action sequence is cleared.
If the solution returned by the search function is better than $S^*$, then $S^*$ is updated with it and the peeling function is called to recursively delete the vertices (and their incident edges) whose degree is too small to belong to a $k$-plex larger than $S^*$, until no such vertex exists. It is sound to remove these vertices since they cannot be included in any feasible $k$-plex larger than $S^*$.
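The peeling step can be sketched as a queue-based degree reduction. The degree threshold used here is derived rather than quoted: a $k$-plex larger than the incumbent has at least $|S^*|+1$ vertices, so each of its members needs at least $|S^*|+1-k$ neighbors, and any vertex with a smaller degree can be discarded. The function name and the adjacency-set representation are assumptions.

```python
from collections import deque
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # hypothetical representation: vertex -> set of neighbors


def peel(graph: Graph, best_size: int, k: int) -> Graph:
    """Repeatedly drop vertices that cannot belong to a k-plex larger than best_size."""
    min_degree = best_size + 1 - k  # derived bound, see the lead-in
    degree = {v: len(nbrs) for v, nbrs in graph.items()}
    removed = {v for v, d in degree.items() if d < min_degree}
    queue = deque(removed)
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u in removed:
                continue
            degree[u] -= 1  # u loses the neighbor v
            if degree[u] < min_degree:
                removed.add(u)
                queue.append(u)
    # Rebuild the adjacency sets of the surviving vertices.
    return {v: nbrs - removed for v, nbrs in graph.items() if v not in removed}


# A triangle 0-1-2 with a tail 2-3-4; with an incumbent of size 2 and k = 1,
# only vertices of degree at least 2 can be part of a larger clique.
tailed = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
reduced = peel(tailed, best_size=2, k=1)
```

Removing vertex 4 drops vertex 3 below the bound in turn, which is the cascading effect that makes peeling effective on sparse massive graphs.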
5.2 Improving BDCC by a Hyper-heuristic
Our previous experiments show that selecting vertices from the add set and the swap set greedily usually leads to a high-quality solution on most hard instances. However, on some problem domains where the optimum solutions are hidden by incorporating low-degree vertices, the search may be misled by the greedy manner and miss the best solutions in some runs [Cai et al., 2011]. To enhance the robustness of BDCC, we design a hyper-heuristic based on simulated annealing to switch between different heuristics dynamically and select a suitable one for different problem instances. We equip BDCC with this hyper-heuristic, developing an algorithm named BDCCH, as outlined in Algorithm 3.
The difference between BDCC and BDCCH is whether the heuristic for adding and swapping is fixed. The BDCCH algorithm adopts three heuristics for adding and swapping: (i) selecting the vertex with the largest neighbor quality, (ii) selecting vertex with largest , and (iii) selecting a vertex randomly. BDCCH maintains a variable for each heuristic to record the size of the best solution found with that heuristic. A temperature is used to control the heuristic selection. Before the search procedure begins, the algorithm selects one heuristic under the current temperature. The selection probability of each heuristic is defined by a Boltzmann distribution over these recorded sizes, which is widely used for softmax selection [Sutton and Barto, 1998]. A relatively high initial temperature leads to nearly equal selection probabilities and thus forces exploration. As the temperature cools down, the algorithm is more inclined to select the heuristic with the highest recorded value and exploit with this heuristic.
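The temperature-controlled choice can be sketched with the usual softmax form, in which heuristic $i$ is chosen with probability proportional to $\exp(b_i/T)$, where $b_i$ is the size of the best solution found with heuristic $i$ and $T$ is the temperature. The symbols, function names and example values below are assumptions for illustration; the initial temperature and cooling rate used in the paper are not reproduced.

```python
import math
import random
from typing import List, Sequence


def boltzmann_probabilities(best_sizes: Sequence[float], temperature: float) -> List[float]:
    """Softmax over recorded best sizes; subtracting the maximum avoids overflow."""
    top = max(best_sizes)
    weights = [math.exp((b - top) / temperature) for b in best_sizes]
    total = sum(weights)
    return [w / total for w in weights]


def select_heuristic(best_sizes: Sequence[float], temperature: float,
                     rng: random.Random) -> int:
    """Draw the index of one heuristic according to the Boltzmann probabilities."""
    probs = boltzmann_probabilities(best_sizes, temperature)
    return rng.choices(range(len(best_sizes)), weights=probs, k=1)[0]


recorded = [30, 31, 29]                            # hypothetical best sizes per heuristic
hot = boltzmann_probabilities(recorded, 1e6)       # high temperature: near-uniform
cold = boltzmann_probabilities(recorded, 0.01)     # low temperature: near-greedy
picked = select_heuristic(recorded, 0.01, random.Random(1))
```

At a high temperature the three probabilities are almost equal, whereas at a low temperature nearly all probability mass sits on the heuristic with the largest recorded size.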
6 Experimental Result
We evaluate our algorithms on the standard DIMACS and BHOSLIB benchmarks as well as on massive real-world graphs.
Table 1: Results on the DIMACS benchmark. Size entries are given as best(average); blank cells carry no value in the source.

| Instance | FDTS (k=2) | BDCC (k=2) | BDCCH (k=2) | time diff. (k=2) | FDTS (k=3) | BDCC (k=3) | BDCCH (k=3) | time diff. (k=3) | FDTS (k=4) | BDCC (k=4) | BDCCH (k=4) | time diff. (k=4) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| brock400_4 | 33(33) | 33(32.88) | 33(33) | 236.95 | 36(36) | 36(36) | 36(36) | 1.25 | 41(41) | 41(41) | 41(41) | 0.25 |
| brock800_1 | 25(25) | 25(25) | 25(25) | 1.79 | 30(29.92) | 30(29.34) | 30(29.92) | 69.77 | 34(34) | 34(33.96) | 34(34) | 35.45 |
| brock800_2 | 25(25) | 25(25) | 25(25) | 1.01 | 30(30) | 30(29.98) | 30(30) | 159.13 | 34(33.96) | 34(33.36) | 34(34) |  |
| brock800_3 | 25(25) | 25(25) | 25(25) | 1.59 | 30(30) | 30(29.64) | 30(30) | 39.76 | 34(34) | 34(33.6) | 34(34) | 28.83 |
| brock800_4 | 26(26) | 26(25.78) | 26(26) | 4.59 | 29(29) | 29(29) | 29(29) | 3.83 | 34(33.12) | 34(33.02) | 34(33.22) |  |
| C1000.9 | 82(81.56) | 82(81.9) | 82(82) |  | 96(95.14) | 96(95.32) | 96(95.22) |  | 109(107.62) | 110(108.32) | 110(108.14) |  |
| C2000.5 | 20(19.6) | 20(19.86) | 20(19.94) |  | 23(22.14) | 23(22.18) | 23(22.4) |  | 26(25.04) | 26(25.06) | 26(25.04) | 38.25 |
| C2000.9 | 92(90.7) | 94(92.44) | 93(91.98) |  | 106(105.14) | 109(107.22) | 108(107.02) |  | 120(118.6) | 123(121.14) | 123(121.02) |  |
| C4000.5 | 21(20.5) | 22(21.02) | 21(20.92) |  | 24(23.38) | 24(24) | 24(23.9) |  | 27(26.12) | 28(27) | 27(26.74) |  |
| DSJC1000.5 | 18(18) | 18(18) | 18(18) | 2.96 | 21(21) | 21(21) | 21(21) | 3.64 | 24(23.98) | 24(23.56) | 24(24) |  |
| gen400_p0.9_65 | 74(72.68) | 74(73.22) | 74(73.14) |  | 101(100.96) | 101(101) | 101(101) |  | 132(132) | 132(132) | 132(132) | 0.02 |
| gen400_p0.9_75 | 79(78.74) | 79(79) | 80(79.02) |  | 114(114) | 114(114) | 114(114) | 0.01 | 136(136) | 136(131.8) | 136(132.22) |  |
| keller5 | 31(31) | 31(31) | 31(31) | 0.01 | 45(45) | 45(45) | 45(45) | 2.26 | 53(53) | 53(53) | 53(53) | 3.64 |
| keller6 | 63(63) | 63(63) | 63(63) | 0.50 | 93(90.28) | 93(90.06) | 93(90.12) |  | 109(106.28) | 113(109.3) | 117(108.9) |  |
| MANN_a45 | 662(661.28) | 661(661) | 662(661.02) |  | 990(990) | 990(990) | 990(990) | 1.83 | 990(990) | 990(990) | 990(990) | 1.94 |
| MANN_a81 | 2162(2161.34) | 2162(2161.04) | 2162(2161.08) |  | 3240(3240) | 3240(3240) | 3240(3240) | 101.04 | 3240(3240) | 3240(3240) | 3240(3240) | 105.40 |
| p_hat1500-2 | 80(80) | 80(80) | 80(80) | 0.01 | 93(93) | 93(93) | 93(93) | 0.04 | 107(107) | 107(106.94) | 107(107) | 59.27 |
| san400_0.7_2 | 32(31.98) | 32(31.98) | 32(32) |  | 47(46.04) | 47(46.04) | 47(46.68) |  | 61(61) | 61(61) | 61(61) | +0.02 |
| san400_0.7_3 | 27(26.4) | 27(27) | 27(27) |  | 38(37.96) | 39(38.04) | 39(38.04) |  | 50(49.12) | 50(49.46) | 50(50) |  |
| san400_0.9_1 | 102(101.36) | 103(102.14) | 103(102.24) |  | 150(150) | 150(150) | 150(150) | 0.02 | 200(200) | 200(200) | 200(200) | 0.02 |
Table 2: Results on the BHOSLIB benchmark. Size entries are given as best(average).

| Instance | FDTS (k=2) | BDCC (k=2) | BDCCH (k=2) | FDTS (k=3) | BDCC (k=3) | BDCCH (k=3) | FDTS (k=4) | BDCC (k=4) | BDCCH (k=4) |
|---|---|---|---|---|---|---|---|---|---|
| frb50-23-1 | 67(66.2) | 67(66.28) | 67(66.14) | 79(78.26) | 79(78.92) | 79(78.98) | 92(90.3) | 92(91.12) | 92(91.36) |
| frb50-23-2 | 67(66) | 66(66) | 66(66) | 79(78.22) | 79(78.88) | 79(79) | 91(90.04) | 91(90.88) | 92(90.98) |
| frb50-23-3 | 65(63.96) | 65(64.06) | 65(64.04) | 76(75.32) | 76(75.98) | 77(75.98) | 87(86.36) | 88(87.52) | 88(87.68) |
| frb50-23-4 | 66(65.56) | 66(65.94) | 66(65.96) | 79(77.88) | 79(78.66) | 79(79) | 91(89.62) | 92(90.76) | 91(91) |
| frb50-23-5 | 67(66.1) | 67(66.22) | 67(66.14) | 79(78.14) | 80(79) | 80(78.88) | 91(90.14) | 92(91.18) | 92(90.98) |
| frb53-24-1 | 71(69.52) | 71(70.8) | 71(70.48) | 85(83.24) | 85(84.24) | 85(84) | 97(96.38) | 99(97.74) | 98(97.56) |
| frb53-24-2 | 70(69.02) | 70(69.62) | 70(69.98) | 83(81.26) | 83(82.12) | 83(82.2) | 94(93.34) | 96(94.82) | 95(94.82) |
| frb53-24-3 | 70(69.16) | 70(69.74) | 70(69.82) | 83(82.06) | 83(82.94) | 83(82.94) | 96(94.78) | 97(96.22) | 97(96.32) |
| frb53-24-4 | 70(68.64) | 70(69.08) | 70(68.96) | 82(81.66) | 84(82.76) | 84(82.88) | 96(94.22) | 96(95.72) | 96(95.88) |
| frb53-24-5 | 68(67.72) | 68(68) | 68(68) | 82(80.16) | 82(81.34) | 82(81.18) | 93(92.18) | 94(93.48) | 94(93.3) |
| frb56-25-1 | 75(73.5) | 75(74.12) | 75(74.08) | 89(87.78) | 89(88.88) | 89(88.84) | 103(101.4) | 104(102.92) | 104(103.02) |
| frb56-25-2 | 74(73.42) | 75(74.46) | 75(74.34) | 88(87.14) | 89(88.54) | 89(88.44) | 102(100.36) | 103(101.96) | 102(101.56) |
| frb56-25-3 | 74(72.58) | 74(73.3) | 74(73.18) | 87(85.72) | 88(87.16) | 88(87.62) | 100(98.66) | 101(100.02) | 101(99.94) |
| frb56-25-4 | 73(72.28) | 74(73.12) | 74(72.88) | 87(85.2) | 88(86.94) | 88(86.44) | 99(98.16) | 101(99.66) | 100(99.32) |
| frb56-25-5 | 74(72.48) | 74(73.04) | 74(72.9) | 87(85.74) | 88(86.92) | 87(86.74) | 100(98.68) | 101(100.12) | 101(100.08) |
| frb59-26-1 | 78(77.14) | 79(78.04) | 79(78.02) | 92(91.04) | 93(92.76) | 93(92.92) | 106(105.1) | 108(107.06) | 107(106.92) |
| frb59-26-2 | 78(77.16) | 79(78.08) | 78(77.92) | 94(91.34) | 94(92.88) | 93(92.78) | 106(105.18) | 108(106.84) | 107(106.6) |
| frb59-26-3 | 77(76.04) | 78(76.84) | 77(76.56) | 92(90.38) | 94(91.72) | 92(91.1) | 106(104.24) | 107(105.64) | 106(105.26) |
| frb59-26-4 | 77(76.18) | 78(77.12) | 77(76.94) | 91(90.24) | 93(92) | 93(91.8) | 105(104.02) | 107(105.78) | 106(105.6) |
| frb59-26-5 | 78(76.24) | 78(77.62) | 78(77.52) | 92(90.08) | 93(91.94) | 92(91.74) | 104(103.38) | 107(105.62) | 106(105.5) |
| frb100-40 | 126(123.82) | 127(124.42) | 126(124) | 149(146.86) | 149(147) | 149(147.38) | 170(168.64) | 173(169.88) | 170(169.14) |
6.1 Experiment Preliminaries
BDCC, BDCCH and their competitor FDTS are all implemented in C++ and compiled by g++ with the '-O3' option. All experiments are run on a server with an Intel Xeon CPU E7-4830 v3 @ 2.10GHz and 128 GB RAM under Ubuntu 16.04.5 LTS. We set the search depth , the stepsize and for BDCC and BDCCH. The initial and cooling rate are set to and respectively for BDCCH. The cutoff time of each instance is set to seconds. All algorithms are executed 50 times independently with different random seeds on each instance for each value of $k$.
6.2 Evaluation on DIMACS and BHOSLIB
We carried out experiments on the standard DIMACS and BHOSLIB benchmarks to evaluate our algorithms. The DIMACS benchmark, taken from the Second DIMACS Implementation Challenge [Johnson and Trick, 1996], includes problems from the real world and randomly generated graphs. The BHOSLIB instances are generated randomly based on the model RB in the phase transition area [Xu et al., 2005] and are famous for their hardness.
Table 1 and Table 2 show the experimental results on these two benchmarks. We report the best size and average size of the $k$-plexes found by FDTS and our algorithms, and compare the average time cost of BDCCH and FDTS when they have the same best and average solution sizes, as shown in the time difference column. Most DIMACS instances are so easy that all three algorithms find a solution of the same quality very quickly, and thus they are not reported. The results show that our algorithms not only find $k$-plexes that FDTS cannot reach on many instances but also usually cost less time than FDTS on the other instances. In particular, BDCC dominates on the CXXXX.X domain but is not robust enough on the brock domain. With a hyper-heuristic, BDCCH enhances the robustness of BDCC while achieving better performance than FDTS. Note that for C1000.9 with $k=2$, san400_0.7_3 with $k=4$ and DSJC1000.5 with $k=4$, BDCCH is the only algorithm that finds $k$-plexes of size 82, 50 and 24 respectively in all 50 runs.
On the BHOSLIB benchmark, BDCC and BDCCH dominate FDTS on most of the instances. We highlight frb100-40, the most challenging instance in BHOSLIB. BDCC improves the lower bound on the size of the maximum 2-plex and 4-plex on frb100-40, indicating its power on large dense graphs. Though BDCCH does not achieve the same performance as BDCC, due to the time-consuming exploration phase of the hyper-heuristic, it outperforms FDTS on most of these instances.
6.3 Evaluation on Massive Graphs
We also evaluate our algorithms on massive real-world graphs from the Network Data Repository [Rossi and Ahmed, 2015]. Thanks to the powerful peeling technique, most of these graphs are reduced significantly and solved in a short time by BDCC and BDCCH. For the other instances, our algorithms and FDTS find a solution of the same quality in all runs, so we do not report the results. To further assess performance on massive graphs, we choose the state-of-the-art exact algorithm named BnB [Gao et al., 2018] for comparison. We run BnB with a cutoff time of 10000 seconds to obtain the optimum solution. For the sake of space, we do not report the instances that can be solved by both BnB and BDCCH in a few seconds. Table 3 shows the best solution found by BDCCH and the average time cost of BnB and BDCCH. An item marked with a symbol in column “max” indicates that this is the size of the optimum solution proved by BnB.
The result in Table 3 shows that BDCCH can find an optimum solution on most instances while costing much less time. For the instances that BnB fails to solve in 10000 seconds, BDCCH can return a satisfactory solution within a few seconds.
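Table 3: Results on massive graphs. The BnB and BDCCH columns give the average time in seconds; blank “max” cells carry no value in the source.

| Instance | V | E | max (k=2) | BnB (k=2) | BDCCH (k=2) | max (k=4) | BnB (k=4) | BDCCH (k=4) |
|---|---|---|---|---|---|---|---|---|
| ca-coauthors-dblp.clq | 540486 | 15245729 |  | 21.132 | 0.8642 |  | 21.392 | 0.6715 |
| ia-wiki-Talk.clq | 92117 | 360767 |  | 14.756 | 0.0379 |  | 775.832 | 0.0822 |
| inf-road-usa.clq | 23947347 | 28854312 |  | 50.768 | 6.5526 | 7 | 10000 | 5.8993 |
| inf-roadNet-CA.clq | 1957027 | 2760388 |  | 3.78 | 0.5032 | 7 | 10000 | 0.4594 |
| inf-roadNet-PA.clq | 1087562 | 1541514 |  | 1.448 | 0.2505 | 7 | 10000 | 0.2486 |
| sc-nasasrb.clq | 54870 | 1311227 |  | 820.244 | 0.0204 | 24 | 10000 | 0.0098 |
| sc-pkustk11.clq | 87804 | 2565054 |  | 7.348 | 3.6641 |  | 74.808 | 7.7031 |
| sc-pkustk13.clq | 94893 | 3260967 |  | 560.98 | 0.1012 | 36 | 10000 | 0.0792 |
| sc-shipsec1.clq | 140385 | 1707759 |  | 1.952 | 0.3167 |  | 9311.244 | 0.3665 |
| sc-shipsec5.clq | 179104 | 2200076 |  | 38.372 | 0.0988 | 24 | 10000 | 0.1255 |
| socfb-A-anon.clq | 3097165 | 23667394 |  | 208.696 | 32.796 |  | 501.744 | 52.559 |
| socfb-B-anon.clq | 2937612 | 20959854 |  | 1128.236 | 21.414 | 33 | 10000 | 22.911 |
| tech-as-skitter.clq | 1694616 | 11094209 |  | 3058.656 | 1.42 |  | 1829.164 | 1.746 |
| tech-RL-caida.clq | 190914 | 607610 |  | 2.272 | 0.103 |  | 1371.484 | 0.138 |
| web-it-2004.clq | 509338 | 7178413 |  | 15.22 | 0.437 |  | 355.148 | 0.49 |
| web-uk-2005.clq | 129632 | 11744049 |  | 18.484 | 0.484 |  | 525.108 | 0.459 |
| web-wikipedia2009.clq | 1864433 | 4507315 |  | 185.24 | 1.06 | 32 | 10000 | 0.828 |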
7 Conclusions and Future Work
In this paper, we have proposed two heuristics, BLP and DTCC, for the maximum $k$-plex problem. Based on BLP and DTCC, we develop a local search algorithm, BDCC, and further improve it by applying a hyper-heuristic strategy in the adding and swapping phases. The experimental results show that our algorithms achieve high robustness across a broad range of problem instances and improve the lower bounds on the size of the maximum $k$-plexes on many hard instances. Meanwhile, our algorithms achieve state-of-the-art performance on massive real-world graphs.
In the future, we plan to further study variants of CC for other combinatorial optimization problems. Besides, it would be interesting to adapt the ideas in this paper to design local search algorithms for other clique relaxation models.
References
[Balasundaram et al., 2011] Balabhaskar Balasundaram, Sergiy Butenko, and Illya V Hicks. Clique relaxations in social network analysis: The maximum k-plex problem. Operations Research, 59(1):133–142, 2011.
[Boginski et al., 2006] Vladimir Boginski, Sergiy Butenko, and Panos M Pardalos. Mining market data: a network approach. Computers & Operations Research, 33(11):3171–3184, 2006.
[Boyan and Moore, 2000] Justin Boyan and Andrew W Moore. Learning evaluation functions to improve optimization by local search. Journal of Machine Learning Research, 1(Nov):77–112, 2000.
[Butenko and Wilhelm, 2006] Sergiy Butenko and Wilbert E Wilhelm. Clique-detection models in computational biochemistry and genomics. European Journal of Operational Research, 173(1):1–17, 2006.
[Cai et al., 2011] Shaowei Cai, Kaile Su, and Abdul Sattar. Local search with edge weighting and configuration checking heuristics for minimum vertex cover. Artificial Intelligence, 175(9-10):1672–1696, 2011.
[Conte et al., 2018] Alessio Conte, Tiziano De Matteis, Daniele De Sensi, Roberto Grossi, Andrea Marino, and Luca Versari. D2k: Scalable community detection in massive networks via small-diameter k-plexes. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1272–1281. ACM, 2018.
[Gao et al., 2018] Jian Gao, Jiejiang Chen, Minghao Yin, Rong Chen, and Yiyuan Wang. An exact algorithm for maximum k-plexes in massive graphs. In IJCAI, pages 1449–1455, 2018.
[Glover and Laguna, 1998] Fred Glover and Manuel Laguna. Tabu search. In Handbook of combinatorial optimization, pages 2093–2229. Springer, 1998.
[Gujjula et al., 2014] Krishna Reddy Gujjula, Krishnan Ayalur Seshadrinathan, and Amirhossein Meisami. A hybrid metaheuristic for the maximum k-plex problem. In Examining Robustness and Vulnerability of Networked Systems, pages 83–92. 2014.
[Johnson and Trick, 1996] David S Johnson and Michael A Trick. Cliques, coloring, and satisfiability: second DIMACS implementation challenge, October 11-13, 1993, volume 26. American Mathematical Soc., 1996.
[Lakhlef, 2015] Hicham Lakhlef. A multi-level clustering scheme based on cliques and clusters for wireless sensor networks. Computers & Electrical Engineering, 48:436–450, 2015.
[McClosky and Hicks, 2012] Benjamin McClosky and Illya V Hicks. Combinatorial algorithms for the maximum k-plex problem. Journal of Combinatorial Optimization, 23(1):29–49, 2012.
[Miao and Balasundaram, 2017] Zhuqi Miao and Balabhaskar Balasundaram. Approaches for finding cohesive subgroups in large-scale social networks via maximum k-plex detection. Networks, 69(4):388–407, 2017.
[Rossi and Ahmed, 2015] Ryan A. Rossi and Nesreen K. Ahmed. The network data repository with interactive graph analytics and visualization. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA, pages 4292–4293, 2015.
[Russell and Norvig, 2016] Stuart J Russell and Peter Norvig. Artificial intelligence: a modern approach. Malaysia; Pearson Education Limited, 2016.
[Sutton and Barto, 1998] Richard S Sutton and Andrew G Barto. Introduction to reinforcement learning, volume 135. MIT press Cambridge, 1998.
[Wang et al., 2016] Yiyuan Wang, Shaowei Cai, and Minghao Yin. Two efficient local search algorithms for maximum weight clique problem. In AAAI, pages 805–811, 2016.
[Wang et al., 2017a] Yiyuan Wang, Shaowei Cai, and Minghao Yin. Local search for minimum weight dominating set with two-level configuration checking and frequency based scoring function. J. Artif. Intell. Res., 58:267–295, 2017.
[Wang et al., 2017b] Yiyuan Wang, Dantong Ouyang, Liming Zhang, and Minghao Yin. A novel local search for unicost set covering problem using hyperedge configuration checking and weight diversity. Science China Information Sciences, 60(6):062103, 2017.
[Wang et al., 2018] Yiyuan Wang, Shaowei Cai, Jiejiang Chen, and Minghao Yin. A fast local search algorithm for minimum weight dominating set problem on massive graphs. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, pages 1514–1522, 2018.
[Xiao et al., 2017] Mingyu Xiao, Weibo Lin, Yuanshun Dai, and Yifeng Zeng. A fast algorithm to compute maximum k-plexes in social network analysis. In AAAI, pages 919–925, 2017.
[Xu et al., 2005] Ke Xu, Frederic Boussemart, Fred Hemery, and Christophe Lecoutre. A simple model to generate hard satisfiable instances. arXiv preprint cs/0509032, 2005.
[Zhou and Hao, 2017] Yi Zhou and Jin-Kao Hao. Frequency-driven tabu search for the maximum s-plex problem. Computers & Operations Research, 86:65–78, 2017.
[Zhou et al., 2018] Yangming Zhou, Béatrice Duval, and Jin-Kao Hao. Improving probability learning based local search for graph coloring. Applied Soft Computing, 65:542–553, 2018.