1 Introduction
Let $G=(V,E)$ be an undirected graph and $W \subseteq V$. The induced graph $G[W]$ contains exactly the nodes $W$ and those edges of $G$ whose incident nodes are both in $W$. If $G[W]$ is a path, it is called an induced path. The length of a longest induced path is also referred to as the induced detour number, which was introduced more than 30 years ago [8]. We denote the problem of finding such a path by LongestInducedPath. It is known to be NP-complete, even on bipartite graphs [17].
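To make the definition concrete, the following sketch tests whether a given node sequence forms an induced path. The dict-of-sets adjacency representation and the function name are our own illustrative conventions, not part of any implementation discussed later.

```python
from itertools import combinations

def is_induced_path(adj, seq):
    """Return True iff `seq` is an induced path in the graph.

    `adj` maps each node to the set of its neighbors. The nodes of
    `seq` must be distinct, consecutive nodes must be adjacent, and no
    non-consecutive pair may be adjacent (no chords).
    """
    if len(set(seq)) != len(seq):
        return False
    for i, j in combinations(range(len(seq)), 2):
        # exactly the consecutive pairs may (and must) be adjacent
        if (seq[j] in adj[seq[i]]) != (j - i == 1):
            return False
    return True
```

For example, on a 4-node path with an extra chord between its first and third node, the full node sequence is rejected while the chordless subsequence is accepted.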
The LongestInducedPath problem has applications in molecular physics, the analysis of social, telecommunication, and more general transportation networks [7, 25, 3, 32], as well as pure graph and complexity theory: It is closely related to the graph diameter—the longest among all shortest paths between any two nodes—which is a commonly analyzed communication property of social networks [29]. A longest induced path witnesses the largest diameter that may occur by the deletion of any node subset in a node failure scenario [29]. The treedepth of a graph is the minimum depth over all of its depth-first-search trees and constitutes an upper bound on its treewidth [6], which is a well-established measure in parameterized complexity and graph theory. Recently, it was shown that any graph class with bounded degree has bounded induced detour number iff it has bounded treedepth [31]. Further, the enumeration of induced paths can be used to predict nuclear magnetic resonance [35].
LongestInducedPath is not only NP-complete, but also W[2]-complete [9] and does not allow a polynomial-time constant-factor approximation under standard complexity assumptions [24, 5]. On the positive side, it can be solved in polynomial time for several graph classes, e.g., those of bounded mim-width (which includes interval, bi-interval, circular-arc, and permutation graphs) [26] as well as bounded-hole, interval-filament, and other decomposable graphs [18]. Furthermore, there are NP-complete problems, such as Coloring [22] and Independent Set [28], that are polynomial-time solvable on graphs with bounded induced detour number.
Recently, the first nontrivial general algorithms to solve the LongestInducedPath problem exactly were devised by Matsypura et al. [29]. There, three different integer linear programming (ILP) formulations were proposed: the first searches for a subgraph with largest diameter; the second utilizes properties derived from the average distance between two nodes of a subgraph; the third models the path as a walk in which no shortcuts can be taken. Matsypura et al. show that the latter (see below for details) is the most effective in practice.
1.0.1 Contribution.
In Section 3, we propose novel ILP formulations based on cut and subtour elimination constraints. We obtain strictly stronger relaxations than those proposed in [29] and describe a way to strengthen them even further in Section 4. After discussing some algorithmic considerations in Section 5, we show in Section 6 that our most effective models are also superior in practice.
2 Preliminaries
2.0.1 Notation.
For $k \in \mathbb{N}$, let $[k] := \{1,2,\dots,k\}$. Throughout this paper, we consider a connected, undirected, simple graph $G=(V,E)$ as our input. Edges are cardinality-two subsets of $V$. If there is no ambiguity, we may write $uv$ for an edge $\{u,v\}$. Given a graph $H$, we refer to its nodes (edges) by $V(H)$ ($E(H)$, respectively). Given a cycle $C$ in $G$, a chord is an edge connecting two nodes of $C$ that are not neighbors along $C$.
2.0.2 Linear programming (cf., e.g., [34]).
A linear program (LP) consists of a cost vector $c \in \mathbb{R}^d$ together with a set of linear inequalities, called constraints, that define a polyhedron $P \subseteq \mathbb{R}^d$. We want to find a point $x \in P$ that maximizes the objective function $c^\top x$. This can be done in polynomial time. Unless P = NP, this is no longer true when restricting $x$ to have integral components; the so-modified problem is an integer linear program (ILP). Conversely, the LP relaxation of an ILP is obtained by dropping the integrality constraints on the components of $x$. The optimal value of an LP relaxation is a dual bound on the ILP's objective; e.g., an upper bound for maximization problems. As there are several ways to model a given problem as an ILP, one aims for models that yield small dimensions and strong dual bounds, to achieve good practical performance. This is crucial, as ILP solvers are based on a branch-and-bound scheme that relies on iteratively solving LP relaxations to obtain dual bounds on the ILP's objective. When a model contains too many constraints, it is often sufficient to use only a reasonably sized constraint subset to achieve provably optimal solutions. This allows us to add constraints during the solving process, which is called separation. We say that model $A$ is at least as strong as model $B$ if, for all instances, the LP relaxation's value of model $A$ is no further from the ILP optimum than that of $B$. If there also exists an instance for which $A$'s LP relaxation yields a tighter bound than that of $B$, then $A$ is stronger than $B$. When referring to models, we use the prefix "IP" with an appropriate subscript; when referring to their respective LP relaxations, we write "LP" instead.
2.0.3 Walk-based model (state-of-the-art).
Recently, Matsypura et al. [29] proposed an ILP model, IP_walk, that is the foundation of the fastest known exact algorithm (called A3c therein) for LongestInducedPath. They introduce $T$ timesteps and, for every node $v$ and timestep $t$, a variable $x_{v,t}$ that is $1$ iff $v$ is visited at time $t$. Constraints guarantee that nodes at non-consecutive time points cannot be adjacent. We recapitulate details in Appendix 0.A. Unfortunately, IP_walk yields only weak LP relaxations (cf. [29] and Section 4). To achieve a practical algorithm, Matsypura et al. iteratively solve IP_walk for an increasing number of timesteps until the path found does not use all timesteps, i.e., a nontrivial dual bound is encountered. In contrast to [29], we consider the number of edges in the path (instead of nodes) as the objective value.
3 New Models
We aim for models that exhibit stronger LP relaxations and are practically solvable via single ILP computations. To this end, we consider what we deem a more natural variable space. We start by describing a partial model, IP_part, which by itself is not sufficient but constitutes the core of our new models. To obtain a full model, IP_cut, we add constraints that prevent subtours.
For notational simplicity, we augment $G$ to $G^*$ by adding a new node $v^*$ that is adjacent to all nodes of $G$. Within $G^*$, we look for a longest induced cycle through $v^*$, where we ignore chords incident to $v^*$. Searching for a cycle instead of a path allows us to homogeneously require that each selected edge, i.e., edge in the solution, has exactly two adjacent edges that are also selected. Let $A(e)$ denote the edges adjacent to edge $e$ in $G^*$. Each binary variable $x_e$ is $1$ iff edge $e$ is selected. We denote the partial model below by IP_part:

$\max \textstyle\sum_{e \in E} x_e$  (1a)
s.t. $\textstyle\sum_{e \ni v^*} x_e = 2$  (1b)
$2x_e \le \textstyle\sum_{f \in A(e)} x_f \le 2 \quad \forall e \in E$  (1c)
$x_e \in \{0,1\} \quad \forall e \in E(G^*)$  (1d)
Constraint (1b) requires selecting exactly two edges incident to $v^*$. To prevent chords, constraints (1c) enforce that any (original) edge $e \in E$ (even if not selected itself!) is adjacent to at most two selected edges; if $e$ is selected, precisely two of its adjacent edges need to be selected as well.
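For illustration, the integral conditions (1b) and (1c) can be checked directly on a candidate edge selection. The frozenset edge encoding and the name of the star node are our own assumptions for this sketch.

```python
def satisfies_partial_model(orig_edges, selected, star="v*"):
    """Check the integral conditions of the partial model on G*.

    `orig_edges` are the edges of G (frozensets of two nodes);
    `selected` is the set of selected edges of G*, possibly including
    star edges {star, v}.  Required: exactly two selected edges at the
    star node (1b); every original edge is adjacent to at most two
    selected edges, and to exactly two if selected itself (1c).
    """
    if sum(1 for f in selected if star in f) != 2:
        return False
    for e in orig_edges:
        # edges are adjacent iff they are distinct and share a node
        k = sum(1 for f in selected if f != e and e & f)
        if k > 2 or (e in selected and k != 2):
            return False
    return True
```

On a path a-b-c closed through the star node the check succeeds; adding the chord ac to the graph makes the very same selection fail, mirroring how (1c) forbids chords.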
3.0.1 Establishing connectivity.
The above model is not sufficient: it allows the solution to consist of multiple disjoint cycles, only one of which contains $v^*$. Still, these cycles have no chords in $G^*$, and no edge in $G^*$ connects any two of these cycles. To obtain a longest single cycle through $v^*$—yielding the longest induced path—we thus have to forbid additional cycles in the solution that do not contain $v^*$. In other words, we want to enforce that the graph induced by the selected edges is connected.
There are several established ways to achieve connectivity: To stay with compact (i.e., polynomially sized) models, we could, e.g., augment IP_part with Miller-Tucker-Zemlin constraints (which are known to be polyhedrally weak [4]) or multi-commodity-flow formulations (IP_flow; cf. Appendix 0.B). However, herein we focus on augmenting IP_part with cut or (generalized) subtour elimination constraints, resulting in the (non-compact) model we denote by IP_cut; see below for details. Such constraints are a cornerstone of many algorithms for diverse problems, where they are typically superior (in particular in practice) to other known approaches [33, 16, 15]. While IP_cut and IP_flow are polyhedrally equally strong (cf. Section 4), we know from other problems that the sheer size of the latter typically nullifies the potential benefit of its compactness. Preliminary experiments show that this is indeed the case here as well.
3.0.2 Cut model (and generalized subtour elimination).
For $S \subseteq V(G^*)$, let $\delta(S)$ be the set of edges with exactly one incident node in $S$, i.e., the cut induced by $S$. For notational simplicity, we may omit braces when referring to node sets of cardinality one. We obtain IP_cut by adding cut constraints to IP_part:

$\textstyle\sum_{e \in \delta(S)} x_e \ge \sum_{e \in \delta(v)} x_e \quad \forall S \subseteq V,\ v \in S$  (2a)
These constraints ensure that if a node $v$ is incident to a selected edge (in an integral solution there are then exactly two such selected edges), any cut separating $v$ from $v^*$ contains at least two selected edges as well. Thus, (at least) two edge-disjoint paths between $v$ and $v^*$ are selected. Together with the cycle properties of IP_part, we can deduce that all selected edges form a common cycle through $v^*$.
An alternative view leads to subtour elimination constraints, which prohibit cycles not containing $v^*$ via counting. It is well known that these constraints can be generalized using binary node variables $y_v$ that indicate whether node $v$ participates in the solution (in our case: in the induced path) [20]. Generalized subtour elimination constraints thus take the form

$\textstyle\sum_{e \in E(S)} x_e \le \sum_{v \in S} y_v - y_w \quad \forall S \subseteq V,\ w \in S$  (2b)
One expects IP_cut and "IP_part with constraints (2b)" to be equally strong, as this is well known for standard Steiner tree and other related models [21, 11, 12]. In fact, there even is a direct one-to-one correspondence between cut constraints (2a) and generalized subtour elimination constraints (2b): Substituting the node variables in (2b) with their definitions $y_v = \frac{1}{2}\sum_{e \in \delta(v)} x_e$, we obtain $\sum_{e \in E(S)} x_e \le \frac{1}{2}\sum_{v \in S}\sum_{e \in \delta(v)} x_e - \frac{1}{2}\sum_{e \in \delta(w)} x_e$. A simple rearrangement, using $\sum_{v \in S}\sum_{e \in \delta(v)} x_e = 2\sum_{e \in E(S)} x_e + \sum_{e \in \delta(S)} x_e$, yields the corresponding cut constraint (2a).
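This correspondence can also be verified numerically: the check below confirms, on one concrete graph with fractional edge values, that a generalized subtour elimination constraint is violated exactly when its corresponding cut constraint is. The graph and values in the example are our own toy data.

```python
from fractions import Fraction
from itertools import combinations

def gsec_matches_cut(V, E, x):
    """For every node set S and every w in S, compare the generalized
    subtour elimination constraint (with y_v defined as half the x-sum
    over delta(v)) against the corresponding cut constraint; return
    True iff both are violated/satisfied in exactly the same cases.
    """
    delta = {v: [e for e in E if v in e] for v in V}
    y = {v: sum((x[e] for e in delta[v]), Fraction(0)) / 2 for v in V}
    for r in range(1, len(V) + 1):
        for S in map(set, combinations(V, r)):
            inner = sum((x[e] for e in E if e <= S), Fraction(0))
            cut = sum((x[e] for e in E if len(e & S) == 1), Fraction(0))
            for w in S:
                gsec_ok = inner <= sum(y[v] for v in S) - y[w]
                cut_ok = cut >= sum((x[e] for e in delta[w]), Fraction(0))
                if gsec_ok != cut_ok:
                    return False
    return True
```

Since the two constraint families are related by an exact algebraic substitution, the check returns True for any choice of fractional values.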
3.0.3 Clique constraints.
We further strengthen our above models by introducing a set of additional inequalities. Consider any clique $Q$ (i.e., complete subgraph) in $G$. The induced path may contain at most one of its edges:

$\textstyle\sum_{e \in E(Q)} x_e \le 1 \quad \forall \text{ cliques } Q \text{ in } G$  (3)
4 Polyhedral Properties of the LP Relaxations
We compare the above models w.r.t. the strength of their LP relaxations, i.e., the quality of their dual bounds. Achieving strong dual bounds is a highly relevant goal also in practice: one can expect a lower running time for the ILP solvers in case of better dual bounds since fewer nodes of the underlying branchandbound tree have to be explored. We defer the proofs of this section to Appendix 0.C.
Since IP_walk requires some upper bound on the objective value—encoded in its number $T$ of timesteps—we can only reasonably compare this model to ours by assuming that we are also given this bound as an explicit constraint. Hence, no dual bound of any of the considered models gives a worse (i.e., larger) bound than $T-1$. As has already been observed in [29], LP_walk in fact always yields this worst-case bound:
Proposition 1
(Proposition 5 from [29]) For every instance and every number $T$ of timesteps, LP_walk has objective value $T-1$.
Note that Proposition 1 is independent of the graph. Given that the longest induced path of a complete graph has length $1$, we also see that the integrality gap of IP_walk is unbounded. Furthermore, this shows that none of the considered models can be weaker than IP_walk. We show that already the partial model IP_part is in fact stronger than IP_walk. In the following, let OPT denote the instance's (integral) optimum value.
Proposition 2
IP_part is stronger than IP_walk. Moreover, for every $c > 0$ there is an infinite family of instances on which LP_part has objective value at most $4$ and LP_walk has objective value at least $c \cdot \mathrm{OPT}$.
Since IP_cut only has additional constraints compared to IP_part, this implies that IP_cut is also stronger than IP_walk. In fact, since constraints (2a) cut off infeasible integral points contained in LP_part's polytope, LP_cut's polytope is clearly even a strict subset of LP_part's. As noted before, we can show that using a multi-commodity-flow scheme (cf. Appendix 0.B) results in LP relaxations equivalent to LP_cut:
Proposition 3
IP_cut and IP_flow are equally strong.
Let IP_cut^k denote IP_cut with clique constraints (3) added for all cliques on at most $k$ nodes. We show that increasing the clique sizes yields a hierarchy of ever stronger models.
Proposition 4
For any $k \ge 3$, IP_cut^{k+1} is stronger than IP_cut^k.
5 Algorithmic Considerations
5.0.1 Separation.
Since IP_cut contains an exponential number of cut constraints (2a), it is not practical in its full form. We follow the traditional separation pattern for branch-and-cut-based ILP solvers: We initially omit the cut constraints (2a), i.e., we start with model IP_part. Iteratively, given a feasible solution to the current LP relaxation, we seek violated cut constraints and add them to the model. If no such constraints are found and the solution is integral, we have obtained an optimal solution to IP_cut. Otherwise, we proceed by branching or—given a sophisticated branch-and-cut framework—by more general techniques.
Given an LP solution $\bar{x}$, we call an edge $e$ active if $\bar{x}_e > 0$. Similarly, we say that a node is active if it has an active incident edge. These active graph elements yield a subgraph $G'$ of $G^*$. For integral LP solutions, we simply compute the connected components of $G'$ and add a cut constraint for each component that does not contain $v^*$. We refer to this routine as integral separation. For a fractional LP solution, we compute the maximum flow between $v^*$ and each active node $v$ in $G'$; the capacity of an edge $e$ is $\bar{x}_e$. If the flow value is less than $\sum_{e \in \delta(v)} \bar{x}_e$, a cut constraint based on the induced minimum cut is added. We call this routine fractional separation. Both routines manage to find a violated constraint if there is any, i.e., they are exact separation routines. In fact, this shows that an optimal solution to LP_cut can be computed in polynomial time [23]. Note that already integral separation suffices to obtain an exact, correct algorithm—we simply may need more branching steps than with fractional separation.
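Integral separation reduces to a connected-components computation on the active subgraph. The following self-contained sketch uses our own data structures (a real implementation would live inside the solver's separation callback) and returns the node set of every active component missing the star node, each of which induces a violated cut constraint.

```python
from collections import deque

def integral_separation(nodes, x, star="v*"):
    """Given an integral LP solution x (dict: frozenset edge -> value),
    return the components of the active subgraph that do not contain
    the star node; each yields a violated cut constraint (2a).
    """
    adj = {v: set() for v in nodes}
    for e, val in x.items():
        if val > 0.5:          # edge is active (integral solution)
            u, w = tuple(e)
            adj[u].add(w)
            adj[w].add(u)
    cuts, seen = [], set()
    for s in nodes:
        if s in seen or not adj[s]:
            continue           # inactive node or component already done
        comp, queue = {s}, deque([s])
        while queue:           # BFS over active edges only
            u = queue.popleft()
            for w in adj[u]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        if star not in comp:
            cuts.append(comp)
    return cuts
```

For a solution consisting of a cycle through the star node plus one stray triangle, exactly the triangle's node set is returned.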
5.0.2 Relaxing variables.
As presented above, our models have one binary variable per edge of $G^*$, each of which may be used for branching by the ILP solver. We can reduce this number by introducing new binary variables $y_v$, $v \in V$, that allow us to relax the edge variables $x_e$ to continuous ones. The new variables are precisely those discussed w.r.t. generalized subtour elimination, i.e., we require $y_v = \frac{1}{2}\sum_{e \in \delta(v)} x_e$. Assuming $x$ to be continuous in $[0,1]$, we have for every edge $e = uv$: if $y_u = 0$ or $y_v = 0$, then $x_e = 0$. Conversely, if $y_u = y_v = 1$, then $x_e = 1$ by (1c). Hence, requiring integrality only for the $y$ variables (and, e.g., branching only on them) suffices to ensure integral $x$ values.
5.0.3 Handling clique constraints.
We use a modified version of the Bron-Kerbosch algorithm [14] to list all maximal cliques. For each such clique we add a constraint (3) during the construction of our model. Recall that there are up to $3^{|V|/3}$ maximal cliques [30], but preliminary tests show that this effort is negligible compared to solving the ILP. Thus, as our preliminary tests also show, other (heuristic) approaches of adding clique constraints to the initial model are not worthwhile.
6 Computational Experiments
6.0.1 Algorithms.
We implement the best state-of-the-art algorithm, i.e., the IP_walk-based one by Matsypura et al., as briefly described in Section 2 and Appendix 0.A. We denote this algorithm by "W". For our implementations of IP_cut, we consider various parameter settings w.r.t. the algorithmic considerations described in Section 5. We denote the arising algorithms by "C", to which we attach sub- and superscripts defining the parameters: the subscript denotes that we use fractional separation in addition to integral separation; one superscript specifies that we introduce node variables as the sole integer variables; the other superscript specifies that we use clique constraints. We consider all eight thereby possible implementations.
6.0.2 Hard- and software.
Our C++ (GCC 8.3.0) code uses SCIP 6.0.1 [19] as the branch-and-cut framework with CPLEX 12.9.0 as the LP solver. We use OGDF snapshot 2018-03-28 [10], in particular its push-relabel implementation, for the separation of cut constraints. We use igraph 0.7.1 [13] to calculate all maximal cliques. For W, we directly use CPLEX instead of SCIP as the branch-and-cut framework. This does not give an advantage to our algorithms, since CPLEX is more than twice as fast as SCIP [1] and we confirmed in preliminary tests that CPLEX is faster on IP_walk. However, we use SCIP for our algorithms, as it allows better parameterizable user-defined separation routines. We run all tests on an Intel Xeon Gold 6134 with 3.2 GHz and 256 GB RAM running Debian 9. We limit each test instance to a single thread and impose a per-instance time limit and memory limit.
6.0.3 Instances.
We consider the instances proposed for LongestInducedPath in [29] as well as additional ones. Overall, our test instances are grouped into four sets: RWC, MG, BAS, and BAL. The first set, RWC, is a collection of 22 real-world networks, including communication and social networks of companies and of characters in books, as well as transportation, biological, and technical networks. See [29] for details on the selection. The Movie Galaxy (MG) set consists of 773 graphs representing social networks of movie characters [27]. While [29] considered only 17 of them, we use the full set here. The other two sets are based on the Barabási-Albert probabilistic model for scale-free networks [2]. In [29], only the chosen parameter values are reported, not the actual instances. Our set BAS recreates instances with the same parameter values: 30 graphs for each choice of the number of nodes and the density $d$. As we will see, these small instances are rather easy for our models. We thus also consider a set BAL of larger graphs on 100 nodes; for each considered density $d$ we generate 30 instances. See http://tcs.uos.de/research/lip for all instances, their sources, and detailed experimental results.
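For intuition, a Barabási-Albert-style generator can be sketched as follows. Since [29] and our benchmark description only report parameter values, the exact generator variant (initial graph, target sampling) is an assumption of this sketch, not a statement about the benchmark's provenance.

```python
import random

def barabasi_albert(n, d, seed=0):
    """Sketch of a Barabasi-Albert-style generator: start from a small
    clique and attach each new node to d distinct existing nodes chosen
    with probability proportional to their current degree."""
    rng = random.Random(seed)
    # initial clique on nodes 0..d (every node has degree d)
    edges = {frozenset((u, w)) for u in range(d) for w in range(u + 1, d + 1)}
    pool = [v for e in edges for v in e]   # degree-weighted sampling pool
    for v in range(d + 1, n):
        targets = set()
        while len(targets) < d:            # d distinct attachment targets
            targets.add(rng.choice(pool))
        for w in targets:
            edges.add(frozenset((v, w)))
            pool += [v, w]                 # update degrees in the pool
    return edges
```

Every generated graph has $\binom{d+1}{2} + (n-d-1)\,d$ edges and covers all $n$ nodes.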
6.0.4 Comparison to the state-of-the-art.
Table 1: Running times in seconds on the RWC instances; "🕒" marks a timeout. The eight C columns correspond to the eight parameter combinations described above.

instance     OPT   |V|   |E|       W       C       C       C       C       C       C       C       C
hightech      13    33    91   15.40    0.90    1.11    1.44    3.15    0.51    0.81    0.41    2.05
karate         9    34    78    2.98    1.73    1.65    2.12    1.32    1.07    3.71    0.66    2.74
mexican       16    35   117   73.30    1.68    2.25    1.12    3.59    1.22    1.34    0.87    0.99
sawmill       18    36    62   70.00    0.51    0.43    0.50    0.44    0.85    3.32    0.82    3.34
tailorS1      13    39   158   83.80    4.78    7.92    4.81    6.45    1.51    1.87    3.29    3.55
chesapeake    16    39   170  106.00    1.84   13.11    2.11   11.00    2.29    4.88    3.19    4.39
tailorS2      15    39   223  445.00    6.80   21.78   11.92   14.91    3.20    4.31    2.89    3.14
attiro        31    59   128       🕒    1.76    2.57    2.48    1.75    1.20    1.75    0.89    1.19
krebs         17    62   153  522.00    3.86   28.21   18.55   10.03   16.00   11.26    3.90    2.33
dolphins      24    62   159       🕒    7.95   27.59   22.72   18.33   19.21    2.99    3.01    4.70
prison        36    67   142       🕒   13.36    5.87    1.09    1.50    3.62    4.05    1.02    1.02
huck           9    69   297   41.70       🕒  144.13   19.46   42.22  114.27   11.63    5.96    7.49
sanjuansur    38    75   144       🕒   30.67    8.64   24.86   10.33    8.22    3.65    3.79    4.71
jean          11    77   254  121.00  464.89   52.89   16.54    9.53   81.03   14.47    3.88    5.14
david         19    87   406       🕒  666.25  719.46   26.70   45.34   85.88   23.94    6.93   10.35
ieeebus       47   118   179       🕒   37.10   22.35   39.82   10.60   15.69    3.13   22.72    5.61
sfi           13   118   200   44.40   47.41    4.39    4.89    3.77   15.13    2.64    3.31    2.44
anna          20   138   493       🕒   21.58  296.69   53.21   74.55  439.23   20.27    7.09    7.58
usair         46   332  2126       🕒       🕒       🕒       🕒       🕒       🕒       🕒  922.94       🕒
494bus       142   494   586       🕒       🕒  379.29       🕒  379.97       🕒  178.92       🕒  170.74
(a), (b): Each point is a median, where timeouts are counted as the time limit. Bars in the background give the number of instances. Gray encircled markers, connected via dotted lines, show the number of solved instances (if not 100%).
(c): Whiskers mark the 20% and 80% percentiles. The gray area marks timeouts.
We start with the most obvious question: Are the new models practically more effective than the state-of-the-art? See Fig. 0(a) for BAS and BAL, Fig. 0(b) for MG, and Table 1 for RWC.
We observe that, rather independent of the benchmark set, the various IP_cut implementations achieve the best running times and success rates. The only exceptions are the instances from MG (cf. Fig. 0(b)): there, the overhead of the stronger model, requiring an explicit separation routine, does not pay off, and W yields performance comparable to the weaker of the cut-based variants. On BAS instances, the cut-based variants dominate (cf. Fig. 0(a)): while all C variants (see below) solve all of BAS, W can only solve the sparsest instances reliably. On BAL (cf. Fig. 0(a)), W fails on virtually all instances. The cut-based model, however, allows implementations (see below for details) that solve all of these harder instances. We point out one peculiarity on the BAL instances, visible in Fig. 0(a). The instances have 100 nodes but varying density. As the density increases from 2 to 30, the median running times of all algorithmic variants increase and the median success rates decrease. However, for even higher densities (where only the variant with integral separation and node variables is successful), the running times drop again and the success rate increases. Interestingly, the number of branch-and-bound (B&B) nodes on the densest instances is only roughly 1/7 of that on the medium-dense ones. This suggests that the denser graphs may allow fewer (near-)optimal solutions and thus more efficient pruning of the search tree.
6.0.5 Comparison of cut-based implementations.
Choosing the best among the eight IP_cut implementations is not as clear as the general choice of IP_cut over IP_walk. In Fig. 0(a), 0(b), and Table 1 we see that, while adding clique constraints is clearly beneficial on MG, on BAS and RWC the benefit is less clear. On BAL, we do not see a benefit, and for the highest densities we even see a clear benefit of not using clique constraints. Each of the densest BAL graphs has a huge number of maximal cliques—and therefore initial clique constraints—whereas the sparser BAL graphs and the RWC graphs yeast and usair have considerably fewer, and all other graphs fewer still.
The probably most surprising finding concerns the choice of the separation routine: while the fractional variant is a quite fast algorithm and yields tighter dual bounds, the simpler integral separation performs better in practice. This is in stark contrast to seemingly similar scenarios like TSP or Steiner problems, where the former is the default choice. In our case, the latter—being very fast and called more rarely—is seemingly strong enough to find effective cutting planes that allow the ILP solver to finish its computations fastest. This is particularly true when combined with the addition of node variables (see below). In fact, the variant with integral separation and node variables is the only choice that can completely solve all large graphs in BAL.
Adding node variables (and relaxing the integrality of the edge variables) nearly always pays off significantly (cf. Fig. 0(a), 0(b)). Fig. 0(d) shows that the models without node variables require many more B&B nodes. In fact, looking more deeply into the data, integral separation with node variables requires roughly as few B&B nodes as fractional separation with node variables, without the overhead of the more expensive separation routine. Only on the smallest BAS instances are the configurations without node variables faster; there, our algorithms only require very few B&B nodes (median).

6.0.6 Dependency of running time on the optimal value.
Since the instance's optimal value determines the final model size for W, it is natural to expect the running time of W to heavily depend on it. Fig. 0(c) shows that this is indeed the case. The new models are less dependent on the solution size, as witnessed, e.g., by the C variant shown in the same figure.
6.0.7 Practical strength of the root relaxations.
For our new models, we may ask how the integer optimal solution value and the value of the LP relaxation (obtained by any cut-based implementation with exact fractional separation) differ; see Fig. 1(a). The gap increases for larger optimal values. Interestingly, we observe that the density of the instance seems to play an important role: for BAS and BAL, the plot shows obvious clusters, which—without a single exception—directly correspond to the different parameter settings as labeled. Denser graphs lead to weaker LP bounds in general.
Fig. 1(b) shows the relative improvement of the LP relaxation when adding clique constraints for the MG instances. For every instance of BAS and BAL, on the other hand, the root relaxation did not change at all by adding clique constraints.
7 Conclusion
We propose new ILP models for LongestInducedPath and prove that they yield stronger relaxations in theory than the previous state-of-the-art. Moreover, we show that they—generally, but in particular in conjunction with further algorithmic considerations—clearly outperform all known approaches in practice. We also provide strengthening inequalities based on cliques in the graph and prove that they form a hierarchy when increasing the size of the cliques.
It could be worthwhile to separate the proposed clique constraints (at least heuristically) to take advantage of their theoretical properties without overloading the initial model with too many such constraints. As it is unclear how to develop an efficient such separation scheme, we leave it as future research.
References
 [1] Achterberg, T.: SCIP: solving constraint integer programs. Math. Prog. Comput. 1(1), 1–41 (2009)
 [2] Barabási, A.L., Albert, R.: Emergence of Scaling in Random Networks. Science 286, 509–512 (1999)
 [3] Barabási, A.L.: Network Science. Cambridge University Press (2016)
 [4] Bektaş, T., Gouveia, L.: Requiem for the Miller-Tucker-Zemlin subtour elimination constraints? EJOR 236(3), 820–832 (2014)
 [5] Berman, P., Schnitger, G.: On the Complexity of Approximating the Independent Set Problem. Inf. Comput. 96(1), 77–94 (1992)
 [6] Bodlaender, H.L., Gilbert, J.R., Hafsteinsson, H., Kloks, T.: Approximating Treewidth, Pathwidth, Frontsize, and Shortest Elimination Tree. J. Alg. 18(2), 238–255 (1995)
 [7] Borgatti, S.P., Everett, M.G., Johnson, J.C.: Analyzing Social Networks. SAGE Publishing (2013)
 [8] Buckley, F., Harary, F.: On longest induced paths in graphs. Chinese Quart. J. Math. 3(3), 61–65 (1988)
 [9] Chen, Y., Flum, J.: On Parameterized Path and Chordless Path Problems. In: CCC. pp. 250–263 (2007)
 [10] Chimani, M., Gutwenger, C., Juenger, M., Klau, G.W., Klein, K., Mutzel, P.: The Open Graph Drawing Framework (OGDF). In: Tamassia, R. (ed.) Handbook on Graph Drawing and Visualization, pp. 543–569. Chapman and Hall/CRC (2013), www.ogdf.net
 [11] Chimani, M., Kandyba, M., Ljubić, I., Mutzel, P.: Obtaining Optimal k-Cardinality Trees Fast. J. Exp. Alg. 14, 5:2.5–5:2.23 (2010)
 [12] Chimani, M., Kandyba, M., Ljubić, I., Mutzel, P.: Strong Formulations for Node-Connected Steiner Network Problems. In: COCOA. pp. 190–200. LNCS 5165 (2008)
 [13] Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal, Complex Systems 1695, 1–9 (2006), http://igraph.sf.net
 [14] Eppstein, D., Löffler, M., Strash, D.: Listing All Maximal Cliques in Sparse Graphs in Near-Optimal Time. In: ISAAC. pp. 403–414. LNCS 6506 (2010)
 [15] Fischetti, M.: Facets of two Steiner arborescence polyhedra. Math. Prog. 51, 401–419 (1991)
 [16] Fischetti, M., SalazarGonzalez, J., Toth, P.: The Generalized Traveling Salesman and Orienteering Problems. In: The Traveling Salesman Problem and Its Variations, Comb. Opt., vol. 12. Springer (2007)
 [17] Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co. (1979)
 [18] Gavril, F.: Algorithms for maximum weight induced paths. Inf. Process. Let. 81(4), 203–208 (2002)
 [19] Gleixner, A., Bastubbe, M., Eifler, L., Gally, T., Gamrath, G., Gottwald, R.L., Hendel, G., Hojny, C., Koch, T., Lübbecke, M.E., Maher, S.J., Miltenberger, M., Müller, B., Pfetsch, M.E., Puchert, C., Rehfeldt, D., Schlösser, F., Schubert, C., Serrano, F., Shinano, Y., Viernickel, J.M., Walter, M., Wegscheider, F., Witt, J.T., Witzig, J.: The SCIP Optimization Suite 6.0. ZIBReport 1826, Zuse Inst. Berlin (2018), https://scip.zib.de
 [20] Goemans, M.X.: The steiner tree polytope and related polyhedra. Math. Prog. 63, 157–182 (1994)
 [21] Goemans, M.X., Myung, Y.S.: A Catalog of Steiner Tree Formulations. Networks 23, 19–28 (1993)
 [22] Golovach, P.A., Paulusma, D., Song, J.: Coloring graphs without short cycles and long induced paths. Disc. Appl. Math. 167, 107–120 (2014)
 [23] Grötschel, M., Lovász, L., Schrijver, A.: Geometric Algorithms and Combinatorial Optimization, Alg. and Comb., vol. 2. Springer (1988)
 [24] Håstad, J.: Clique is hard to approximate within $n^{1-\varepsilon}$. Acta Math. 182(1), 105–142 (1999)
 [25] Jackson, M.O.: Social and Economic Networks. Princeton University Press (2010)
 [26] Jaffke, L., Kwon, O., Telle, J.A.: PolynomialTime Algorithms for the Longest Induced Path and Induced Disjoint Paths Problems on Graphs of Bounded MimWidth. In: IPEC. pp. 21:1–13. LIPIcs 89 (2017)
 [27] Kaminski, J., Schober, M., Albaladejo, R., Zastupailo, O., Hidalgo, C.: Moviegalaxies  Social Networks in Movies. Harvard Dataverse (V3 2018)
 [28] Lozin, V., Rautenbach, D.: Some results on graphs without long induced paths. Inf. Process. Let. 88(4), 167–171 (2003)
 [29] Matsypura, D., Veremyev, A., Prokopyev, O.A., Pasiliao, E.L.: On exact solution approaches for the longest induced path problem. EJOR 278, 546–562 (2019)
 [30] Moon, J.W., Moser, L.: On Cliques in Graphs. Israel J. of Math. 3(1), 23–28 (1965)
 [31] Nešetřil, J., Ossona de Mendez, P.: Sparsity – Graphs, Structures, and Algorithms, Alg. and Comb., vol. 28. Springer (2012)
 [32] Newman, M.: Networks: An Introduction. Oxford University Press (2010)
 [33] Polzin, T.: Algorithms for the Steiner problem in networks. Ph.D. thesis, Saarland University, Saarbrücken, Germany (2003)
 [34] Schrijver, A.: Theory of linear and integer programming. WileyIntersci. series in disc. math. and opt., Wiley (1999)
 [35] Uno, T., Satoh, H.: An Efficient Algorithm for Enumerating Chordless Cycles and Chordless Paths. In: Discovery Science (DS). pp. 313–324. LNCS 8777 (2014)
Appendix
Appendix 0.A Walk-Based Model (State-of-the-Art)
The following ILP model, denoted by IP_walk, was recently presented in [29]. It constitutes the foundation of the fastest known exact algorithm. It models a timed walk through the graph that prevents "shortcut" edges. Let $T$ denote the number of timesteps, i.e., one more than an upper bound on the length of the path in terms of its number of edges. For every node $v$ and every point in time $t$ there is a variable $x_{v,t}$ that is $1$ iff $v$ is visited at time $t$ (4g).

$\max \textstyle\sum_{v \in V} \sum_{t \in [T]} x_{v,t} - 1$  (4a)
s.t. $\textstyle\sum_{v \in V} x_{v,t} \le 1 \quad \forall t \in [T]$  (4b)
$\textstyle\sum_{t \in [T]} x_{v,t} \le 1 \quad \forall v \in V$  (4c)
$\textstyle\sum_{v \in V} x_{v,t+1} \le \sum_{v \in V} x_{v,t} \quad \forall t \in [T-1]$  (4d)
$x_{u,t} + x_{v,t+1} \le 1 \quad \forall u \ne v,\ uv \notin E,\ t \in [T-1]$  (4e)
$x_{u,t} + x_{v,t'} \le 1 \quad \forall uv \in E,\ t,t' \in [T],\ |t-t'| \ge 2$  (4f)
$x_{v,t} \in \{0,1\} \quad \forall v \in V,\ t \in [T]$  (4g)
In every step, at most one node can be visited (4b); a node can be visited at most once (4c); the time points have to be used consecutively (4d); nodes visited at consecutive time points need to be adjacent (4e); and nodes at non-consecutive time points cannot be adjacent (4f).
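As a cross-check for such models on tiny graphs, an exhaustive reference solver is easy to state (exponential time; purely illustrative and not part of any algorithm discussed here):

```python
from itertools import combinations

def longest_induced_path_bruteforce(adj):
    """Return the number of edges of a longest induced path by testing
    every node subset: a subset induces a path iff its induced degree
    sequence is 1,1,2,...,2 and the induced subgraph is connected."""
    best = 0
    nodes = list(adj)
    for r in range(2, len(nodes) + 1):
        for sub in map(set, combinations(nodes, r)):
            degs = sorted(len(adj[v] & sub) for v in sub)
            if degs != [1, 1] + [2] * (r - 2):
                continue
            if _connected(adj, sub):
                best = r - 1
    return best

def _connected(adj, sub):
    """DFS restricted to the nodes in `sub`."""
    start = next(iter(sub))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()] & sub:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == sub
```

On a 5-cycle the longest induced path has three edges, and on a complete graph a single edge, matching the expected optima.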
However, IP_walk yields only weak LP relaxations (cf. Section 4). To obtain a practical algorithm, the authors of [29] iteratively solve IP_walk for increasing values of $T$ until its optimal objective value becomes less than $T-1$, i.e., the found path does not use all timesteps. They use the graph's diameter as a lower bound on the path length to avoid trivial calls. In addition, they add supplemental symmetry-breaking inequalities.
Appendix 0.B Multi-Commodity-Flow Model
A flow formulation allows a compact, i.e., polynomially sized, model. We start with IP_part and extend it in the following way: Each node $v \in V$ is assigned a commodity and sends—if $v$ is part of the induced path—two units of flow of this commodity from $v$ to $v^*$ using only selected edges, where edges have capacity one (per commodity). This ensures that each node in the solution lies on a common cycle with $v^*$. Consider the bidirected arc set $A$ that contains a directed arc for both directions of each edge in $G^*$. Let $\delta^+(u)$ ($\delta^-(u)$) denote the arcs of $A$ with source (resp. target) $u$. We use variables $f^v_a$ to model the flow of commodity $v$ over arc $a$; we do not actively require them to be binary. The model below, together with IP_part, forms IP_flow:

$f^v_{(u,w)} + f^v_{(w,u)} \le x_{uw} \quad \forall uw \in E(G^*),\ v \in V$  (5a)
$\textstyle\sum_{a \in \delta^+(u)} f^v_a - \sum_{a \in \delta^-(u)} f^v_a = \begin{cases} \sum_{e \in \delta(v)} x_e & \text{if } u = v \\ 0 & \text{otherwise} \end{cases} \quad \forall v \in V,\ u \in V$  (5b)
$f^v_a \ge 0 \quad \forall v \in V,\ a \in A$  (5c)
The capacity constraints (5a) ensure that flow is only sent over selected edges. Equations (5b) model flow preservation (up to, but not including, the sink $v^*$) and send the commodities away from their source $v$ if $v$ is part of the solution.
Appendix 0.C Proofs for Section 4 (Polyhedral Properties)
Proposition 1
(Proposition 5 from [29]) For every instance and every number $T$ of timesteps, LP_walk has objective value $T-1$.
Proof
We set $x_{v,t}$ to $1/|V|$ for all $v \in V$ and $t \in [T]$. It is easy to see that this solution is feasible and attains the claimed objective value. ∎
Proposition 2
IP_part is stronger than IP_walk. Moreover, for every $c > 0$ there is an infinite family of instances on which LP_part has objective value at most $4$ and LP_walk has objective value at least $c \cdot \mathrm{OPT}$.
Proof
By Proposition 1, LP_walk will always attain value $T-1$, i.e., the maximum possible. To show the strength claim, it thus suffices to give instances where LP_part yields a strictly tighter bound.
Already a star with at least three leaves proves the claim, as on such graphs LP_part already yields a strictly smaller bound. However, it can be argued that such graphs and substructures are easy to preprocess. Thus, we prove the claim with a more suitable instance class.
Choose any $k \ge 3$. Start with two nodes $u$ and $w$, connect them with $k$ internally node-disjoint paths of length 2, and add a new node $p$ with edge $pu$. A longest induced path in this graph contains exactly $3$ edges: $pu$ and the two edges of one of the $k$ paths. Let $d(v)$ denote the degree of node $v$ in $G$, i.e., without the added star node $v^*$. By summing all constraints (1c) we deduce

$\sum_{e \in E} \sum_{f \in A(e)} x_f \le 2|E| = 4k+2.$
For the double sum, we see that every original edge is incident to $u$ or $w$ and is hence considered at least $k$ times, i.e., it has at least $k$ adjacent original edges, while each star edge $v^*v$ is considered $d(v)$ times. Thus $\sum_{e \in E}\sum_{f \in A(e)} x_f \ge k\sum_{e \in E} x_e + \sum_{v \in V} d(v)\, x_{v^*v}$. In the second sum, $v^*p$ is the only edge with coefficient $1$ (instead of at least $2$), and we thus have $\sum_{v \in V} d(v)\, x_{v^*v} \ge 2\sum_{v \in V} x_{v^*v} - x_{v^*p}$. By (1b) and the variable bounds we have $\sum_{v \in V} x_{v^*v} = 2$ and $x_{v^*p} \le 1$. Overall we obtain $k\sum_{e \in E} x_e \le 4k + 2 - 3$, giving objective value at most $4 - 1/k$. As the objective value of the ILP must be integral, this even yields the optimal bound $3$ when using LP_part within an ILP solver. Finally, the number of nodes $k+3$ grows with $k$, so that LP_walk's value $T-1$ exceeds $c \cdot \mathrm{OPT}$ for sufficiently large $k$.
We furthermore note that, to achieve two-connected graphs, we could, e.g., also consider a cycle in which each edge is replaced by two internally node-disjoint paths of length 2. However, in the above instance class the gap between the relaxations is larger, which is why we refrain from giving further details on the latter class. ∎
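The instance family used in this proof is easy to generate and sanity-check; the node names and the adjacency representation below are our own conventions for this sketch.

```python
def theta_star_instance(k):
    """Two hub nodes u and w joined by k internally node-disjoint
    length-2 paths (via midpoints m0..m{k-1}), plus a pendant node p
    attached to u; returns an adjacency dict."""
    adj = {"u": {"p"}, "w": set(), "p": {"u"}}
    for i in range(k):
        m = "m%d" % i
        adj[m] = {"u", "w"}
        adj["u"].add(m)
        adj["w"].add(m)
    return adj

def is_induced_path(adj, seq):
    """Distinct nodes, consecutive pairs adjacent, no chords."""
    ok = len(set(seq)) == len(seq)
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            ok &= (seq[j] in adj[seq[i]]) == (j - i == 1)
    return ok
```

The sequence p, u, m0, w is an induced path with three edges, while appending a second midpoint creates a chord back to u, matching the optimum value of 3 used in the proof.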
Proposition 3
IP_cut and IP_flow are equally strong.
Proof
Let $P_{\mathrm{flow}}$ and $P_{\mathrm{cut}}$ be the polytopes of LP_flow and LP_cut, respectively. Let $P'$ be the projection of $P_{\mathrm{flow}}$ onto the $x$ variables, ignoring the $f$ variables. We show that $P' = P_{\mathrm{cut}}$; clearly, the projection retains the objective value. We observe that by constraints (5a), for any commodity $v$ there can be at most $x_e$ units of its flow along edge $e$. By constraints (5b), each node $v$ sends $\sum_{e \in \delta(v)} x_e$ units of flow that have to arrive at node $v^*$. Consequently, the claim—both that any LP_flow solution maps to an LP_cut solution and vice versa—follows directly from the duality of max-flow and min-cut. ∎
Proposition 4
For any $k \ge 3$, IP_cut^{k+1} is stronger than IP_cut^k.
Proof
LP_cut^{k+1} is at least as strong as LP_cut^k, as we only add new constraints. Let $G = K_{k+1}$, the complete graph on $k+1$ nodes; its longest induced path consists of a single edge. By choosing $Q = V(G)$ in constraint (3), LP_cut^{k+1} has objective value $1$.
However, LP_cut^k allows a solution with objective value strictly greater than $1$: We set $x_e = 2/(k+1)$ for each edge $e$ incident to $v^*$ and $x_e = \min\{1/(k+1),\ 2/(k(k-1))\}$ for each original edge $e$ to obtain an LP-feasible solution: Clearly, constraints (1b) and (1c) are satisfied. The cut constraints (2a) are satisfied since the edge variables are chosen uniformly (w.r.t. the two above edge types) and the right-hand side of the constraint sums over at least as many edge variables (per type) as the left-hand side. For any clique of size at most $k$, the left-hand side of its clique constraint (3) sums up to at most $1$. The objective value is $\binom{k+1}{2} \cdot \min\{1/(k+1),\ 2/(k(k-1))\} = \min\{k/2,\ (k+1)/(k-1)\} > 1$ for $k \ge 3$.
We note that it is straightforward to generalize $G$ so that it contains $K_{k+1}$ only as a subgraph, while retaining the property of having a gap between the two considered LP relaxations. ∎