# Extended formulations from communication protocols in output-efficient time

Deterministic protocols are well-known tools to obtain extended formulations, with many applications to polytopes arising in combinatorial optimization. Although constructive, those tools are not output-efficient, since the time needed to produce the extended formulation also depends on the size of the slack matrix (hence, of the exact description in the original space). We give general sufficient conditions under which those tools can be implemented so as to be output-efficient, showing applications to e.g. Yannakakis' extended formulation for the stable set polytope of perfect graphs, for which, to the best of our knowledge, an efficient construction was previously not known. For specific classes of polytopes, we also give a direct, efficient construction of those extended formulations. Finally, we deal with extended formulations coming from unambiguous non-deterministic protocols.

## 1 Introduction

Linear extended formulations are a fundamental tool in integer programming and combinatorial optimization, since they allow one to reduce an optimization problem over a polyhedron $P$ to an analogous one over a polyhedron $Q$ that linearly projects to $P$. When $Q$ can be described with far fewer inequalities than $P$ (typically, polynomial vs. exponential in the dimension of $P$), this leads to a computational speedup. Such a $Q$ is called an extension of $P$, any set of linear inequalities describing $Q$ is an extended formulation, and the minimum number of inequalities in an extended formulation for $P$ is called the extension complexity of $P$, denoted by $\mathrm{xc}(P)$. Computing or bounding the extension complexity of polytopes has been an important topic in recent years, see e.g. [chan2016approximate, fiorini2012linear, rothvoss2017matching].

Lower bounds on extension complexity are usually unconditional: they neither rely on any complexity theory assumptions, nor take into account the time needed to produce the extension or the encoding length of coefficients in the inequalities. Upper bounds are often constructive and produce an extended formulation in time polynomial (often linear) in its size. Examples of the latter include Balas’ union of polytopes, reflection relations, and branched polyhedral systems (see e.g. [conforti2010extended, kaibel2011extended]).

The fact that we can construct extended formulations efficiently is crucial, since their final goal is to make certain optimization problems (more) tractable. It is interesting to observe that there is indeed a gap between the existence of certain extended formulations, and the fact that we can construct them efficiently: for instance in [bazzi2018no], it is shown that there is a small extended formulation for the stable set polytope that is -approximated, but we do not expect to obtain it efficiently because of known hardness results [haastad2001some]. In another case, proof of the existence of a subexponential formulation with integrality gap for min-knapsack [bazzi2017small] predated its efficient construction [fiorini1711strengthening].

In this paper, we investigate the efficiency of an important tool for producing extended formulations: communication protocols. In a striking result, Yannakakis [yannakakis1991expressing] showed that a deterministic communication protocol computing the slack matrix of a polytope can be used to produce an extended formulation for . The number of inequalities of the latter is at most , where is the complexity of the protocol (see Section 2 for definitions). Hence, deterministic protocols can be used to provide upper bounds on extension complexity of polytopes. This reduction is constructive, but not efficient. Indeed, it produces an extended formulation with variables, inequalities, and an equation per row of . Basic linear algebra implies that most equations are redundant, but in order to obtain a basis we may have to go through the full (possibly exponential-size) list. The main application of Yannakakis’ technique is arguably given in his original paper, and deals with the stable set polytope of perfect graphs. This is a class of polytopes that has received much attention in the literature [Chvatal75, grotschel1984polynomial]. They also play an important role in extension complexity: while many open problems in the area were settled one after the other [chan2016approximate, fiorini2012linear, kaibel2013constructing, rothvoss2017matching], we still do not know if the stable set polytope of perfect graphs has polynomial-size extension complexity. Yannakakis’ protocol gives an upper bound of , while the best lower bound is as small as  [aprile2017extension]. On the other hand, a maximum stable set in a perfect graph can be computed efficiently via a polysize semidefinite extension known as Lovász’ Theta body [lovasz1979shannon]. This can also be used, together with the ellipsoid method, to efficiently find a coloring of a perfect graph, see e.g. [schrijver2002combinatorial, Section 67.1].
We remark that designing a combinatorial (or at least SDP-free) polynomial-time algorithm to find a maximum stable set in perfect graphs, or to color them, is a main open problem [chudnovsky2015coloring].

Our results. In this paper, we investigate conditions under which we can explicitly obtain an extended formulation from a communication protocol in time polynomial in the size of the formulation itself. We first show a general algorithm that achieves this for any deterministic protocol, given a compact representation of the protocol as a labelled tree and of certain extended formulations associated to leaves of the protocol. The algorithm runs in linear time in the input size and is flexible, in that it also handles non-exact extended formulations. We then show that in some cases one can obtain those extended formulations directly, without relying on this general algorithm. This may be more interesting computationally. We show applications of our techniques in the context of (not only) perfect graphs. Our most interesting application is to Yannakakis’ original protocol, whose associated extended formulation we construct in time , hence linear in the size of the formulation itself. For perfect graphs, this gives a subexponential SDP-free algorithm that computes a maximum stable set (resp. an optimal coloring). For general graphs, this gives a new relaxation of the stable set polytope which is (strictly) contained in the clique relaxation. Finally, we extend our result to obtain extended formulations from unambiguous non-deterministic protocols.

## 2 Preliminaries

Communication protocols. We start by describing the general setting of communication protocols, referring to [kushilevitz1996communication] for details. Let be a matrix with row (resp. column) set (resp. ), and two agents Alice and Bob. Alice is given as input a row index , Bob a column index , and they aim at determining by exchanging information according to some pre-specified mechanism, that goes under the name of protocol. The latter is said to compute if, for any input of Alice and of Bob, it returns ; it is deterministic if the actions of Alice (resp. Bob) at any given step only depend on her (resp. his) input and on what they exchanged so far. Such a protocol can be modelled as a rooted tree, with each vertex modelling a step where one of Alice or Bob sends a bit (hence labelled with or ), and its children representing subsequent steps of the protocol. The leaves of the tree indicate the termination of the protocol and are labelled with the corresponding output. The tree is therefore binary, with each edge representing a 0 or a 1 sent. Hence, a deterministic protocol can be identified by the following parameters: a rooted binary tree with node set ; a function (“Alice”,“Bob”) associating each vertex to its type; for each leaf , a positive number corresponding to the value output at ; for each that is not a leaf and such that (resp. ) the set of inputs (resp. ) such that Alice (resp. Bob) sends a at node . We represent this succinctly by .

An execution of the protocol is a path of from the root to a leaf, whose edges correspond to the bits sent during the execution. The inputs that lead to the same leaf correspond to entries of with the same value . They form a submatrix of with constant values. Such submatrices are called monochromatic rectangles (The term monochromatic distinguishes them from a generic rectangle, which is any submatrix of ). The complexity of the protocol is given by the height of the tree . A deterministic protocol computing gives a partition of in at most monochromatic rectangles. We remark that one can obtain a protocol (and a partition in rectangles) for given a protocol for by just exchanging the roles of Alice and Bob.
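The tree model above can be made concrete with a small sketch. The names `Node`, `run` and `rectangles`, the symbol c for the complexity, and the toy matrix M[i][j] = (i // 2) XOR (j // 2) are illustrative assumptions, not taken from the paper. The sketch follows an execution from the root to a leaf and groups inputs by the leaf they reach, which recovers the partition into monochromatic rectangles.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass
class Node:
    owner: Optional[str] = None    # "A" (Alice) or "B" (Bob); None at a leaf
    ones: frozenset = frozenset()  # owner's inputs for which the bit 1 is sent
    child0: Optional["Node"] = None
    child1: Optional["Node"] = None
    value: Optional[int] = None    # output of the protocol at a leaf


def run(root, i, j):
    """Follow the execution on input (i, j); return (bits sent, output)."""
    node, bits = root, ""
    while node.value is None:
        own_input = i if node.owner == "A" else j
        bit = 1 if own_input in node.ones else 0
        bits += str(bit)
        node = node.child1 if bit else node.child0
    return bits, node.value


def rectangles(root, rows, cols):
    """Group inputs by the leaf they reach: leaf -> (row set, column set)."""
    rects = {}
    for i, j in product(rows, cols):
        bits, _ = run(root, i, j)
        r, c = rects.setdefault(bits, (set(), set()))
        r.add(i)
        c.add(j)
    return rects


# Hypothetical toy protocol: Alice sends i // 2, then Bob sends j // 2.
def leaf(v):
    return Node(value=v)


def bob(v0, v1):
    return Node(owner="B", ones=frozenset({2, 3}), child0=leaf(v0), child1=leaf(v1))


root = Node(owner="A", ones=frozenset({2, 3}), child0=bob(0, 1), child1=bob(1, 0))
rows = cols = range(4)
rects = rectangles(root, rows, cols)
```

In this toy instance the tree has height c = 2 and the execution reaches four leaves, matching the bound of at most 2^c rectangles for a tree of height c.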

The setting of non-deterministic protocols is similar to the above, but now Alice and Bob are allowed to make guesses in their communication, with the requirement that, at the end of the protocol, they can both independently verify that the outcome corresponds to for at least one guess made during the protocol. A non-deterministic protocol is called unambiguous if for any input , exactly one guess allows them to verify the value of . The complexity of a non-deterministic protocol is the maximum (over all inputs and guesses) number of bits exchanged during the protocol. Non-deterministic protocols of complexity provide a cover of with at most monochromatic rectangles, which is a partition in the case the protocol is unambiguous. Moreover, each partition of in monochromatic rectangles corresponds to an unambiguous protocol of complexity , where Alice guesses the rectangle covering .

We want to mention another class of communication protocols that is relevant to extended formulations, namely randomized protocols that compute a (non-negative) matrix in expectation. These generalize both deterministic and non-deterministic protocols and have been defined in [faenza2015extended], where they are shown to be equivalent to non-negative factorizations (see the next section) and to essentially capture the notion of extension complexity. In fact, every extended formulation is obtained from a simple randomized protocol, see again [faenza2015extended]. Because of the generality of the notion, it seems hard to obtain a general algorithm as in Theorem 5.

Extended formulations and how to find them. We follow here the framework introduced in [pashkovich2012extended], that extends [yannakakis1991expressing]. Consider a pair of polytopes with , where has rows. A polyhedron is an extension for the pair if there is a projection such that . An extended formulation for is a set of linear inequalities describing as above, and the minimum number of inequalities in an extended formulation for is its extension complexity. The slack matrix of the pair is the non-negative matrix with , where is the -th row of . A non-negative factorization of is a pair of non-negative matrices such that . The non-negative rank of is the smallest intermediate dimension in a non-negative factorization of .

###### Theorem 1.

[pashkovich2012extended] [Yannakakis’ Theorem for pairs of polytopes] Given a slack matrix of a pair of polytopes of dimension at least , the extension complexity of is equal to the non-negative rank of . In particular, if is a non-negative factorization of , then

Hence, a factorization of the slack matrix of intermediate dimension gives an extended formulation of size (i.e., with inequalities). However, such a formulation has as many equations as the number of rows of .

Now assume we have a deterministic protocol of complexity for computing . The protocol gives a partition of into at most monochromatic rectangles. This implies that , where and each is a rank 1 matrix corresponding to a monochromatic rectangle of non-zero value. Hence can be written as a product of two non-negative matrices of intermediate dimension , where if the (monochromatic) rectangle contains row index and 0 otherwise, and is equal to the value of if contains column index , and 0 otherwise. As a consequence of Theorem 1, this yields an extended formulation for . In particular, let be the set of rectangles of produced by the protocol and, for , let be the set of rectangles whose row index set includes . Then the following is an extended formulation for :

$$a_i x + \sum_{R \in \mathcal{R}_i} y_R = b_i \quad \forall i = 1, \dots, m, \qquad y \ge 0. \tag{1}$$

Again, the formulation has as many equations as the number of rows of , and it is not clear how to get rid of redundant equations efficiently. Note that all definitions and facts from this section specialize to those from [yannakakis1991expressing] for a single polytope when .
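As an illustration of the factorization behind (1), the following minimal sketch builds the two non-negative factors from a list of non-zero monochromatic rectangles and records, for each row, the rectangles containing it. The function names, the names `T` and `U` for the factors, and the 3×3 example matrix are assumptions made for the example only.

```python
def factorization_from_rectangles(rows, cols, rects):
    """rects: list of (row_set, col_set, value), the non-zero monochromatic
    rectangles of a partition of S. Returns non-negative matrices (T, U)."""
    # T[i][k] = 1 if rectangle k contains row i, and 0 otherwise.
    T = [[1 if i in r_rows else 0 for (r_rows, _, _) in rects] for i in rows]
    # U[k][j] = value of rectangle k if it contains column j, and 0 otherwise.
    U = [[val if j in r_cols else 0 for j in cols] for (_, r_cols, val) in rects]
    return T, U


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical non-negative matrix with two non-zero rectangles.
S = [[2, 2, 0],
     [2, 2, 3],
     [0, 0, 3]]
rects = [({0, 1}, {0, 1}, 2), ({1, 2}, {2}, 3)]
T, U = factorization_from_rectangles(range(3), range(3), rects)

# For each row i, the indices of the rectangles whose row set contains i:
# these are the variables y_R appearing in the i-th equation of (1).
rects_of_row = {i: [k for k, (r_rows, _, _) in enumerate(rects) if i in r_rows]
                for i in range(3)}
```

Here row 1 lies in both rectangles, so its equation in (1) would involve both of the corresponding variables, while rows 0 and 2 involve one each.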

Stable set polytope and $\mathrm{QSTAB}$. The stable set polytope $\mathrm{STAB}(G)$ is the convex hull of the characteristic vectors of stable (also, independent) sets of a graph $G$. It has exponential extension complexity [fiorini2012linear, goos2018extension]. The clique relaxation of $\mathrm{STAB}(G)$ is:

$$\mathrm{QSTAB}(G) = \Big\{x \in \mathbb{R}^d_+ : \sum_{v \in C} x_v \le 1 \text{ for all cliques } C \text{ of } G\Big\}. \tag{2}$$

Note that in (2) one could restrict to maximal cliques, even though in the following we will consider all cliques when convenient. As a consequence of the equivalence between separation and optimization, optimizing over is NP-hard for general graphs, see e.g. [schrijver2002combinatorial]. However, the clique relaxation is exact for perfect graphs, for which the optimization problem is polynomial-time solvable via semidefinite programming (see Section 1):

###### Theorem 2 ([Chvatal75]).

A graph is perfect if and only if .

The following result from [yannakakis1991expressing] is crucial for this paper.

###### Theorem 3.

Let be a graph with vertices. There is a deterministic protocol of complexity computing the slack matrix of the pair . Hence, there is an extended formulation of size for .

We remark that, when $G$ is perfect, Theorem 3 gives a quasipolynomial size extended formulation for $\mathrm{STAB}(G)$. However, as discussed above, it is not clear how to obtain such a formulation in subexponential time.

## 3 A general approach

We present here a general technique to explicitly and efficiently produce extended formulations from deterministic protocols, starting with an informal discussion.

It is important to address the issue of what our input is, and what assumptions we need in order to get an “efficient” algorithm. Recall that, in our setting, the matrix describing is thought of as being exponential in size, while is polynomial (or quasipolynomial). We assume that we have an implicit representation of our polytope of interest, and in particular of . This assumption is natural as, without it, we can hardly imagine having any useful protocol for the slack matrix of . As an example, consider the case of the stable set polytope of perfect graphs, of which we know the vertices and inequalities without of course having to explicitly list them (as that would take exponential time).

Recall that a deterministic protocol is identified with a tuple . While we can assume that are given to us explicitly, the sets have in general exponential size. Hence we assume to have an implicit description of them, in particular of our rectangles : notice that the latter correspond to leaves of and can be identified by a sequence of bits exchanged during the protocol. Knowing the structure of our protocol gives us an implicit representation of for each : again, this is a reasonable and basic assumption for approaching the formulation (1) from an algorithmic point of view.

The natural approach to reduce the size of (1) is to eliminate redundant equations. However, the structure of the coefficient matrix depends both on and on rectangles ’s of the factorization, which can have a complex behaviour. The reader is encouraged to try e.g. on the extended formulations obtained via Yannakakis’ protocol for , perfect: the sets ’s have very non-trivial relations with each other that depend heavily on the graph, and we did not manage to directly reduce the system (1) for general perfect graphs. Theorem 5 shows how to bypass this problem. Informally, we shift the problem of eliminating redundant equations from the system (1) to a family of systems , one for each rectangle produced by the protocol, where is a single variable. The latter systems can still have exponential size, but they may be much easier to deal with since each of them has only one extra variable.

We now switch gears and make the discussion formal. Let us start by recalling a well-known theorem from Balas [balas1979disjunctive], in a version given by Weltge ([weltge2015sizes], Section 3.1.1).

###### Theorem 4.

Let be polytopes, with , where is a linear map, for . Let . Then we have:

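$$P = \{x \in \mathbb{R}^d : \exists\, y_1 \in \mathbb{R}^{m_1},\ y_2 \in \mathbb{R}^{m_2},\ \lambda \in \mathbb{R} : x = \pi_1(y_1) + \pi_2(y_2),\ A_1 y_1 \le \lambda b_1,\ A_2 y_2 \le (1 - \lambda) b_2,\ 0 \le \lambda \le 1\}.$$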

Moreover, the inequality ( respectively) is redundant if () has dimension at least 1. Hence .

We now give the main theorem of this section. Note that, while the result relies on the existence of a deterministic protocol , its complexity does not depend on the encoding of and (see the previous section).

###### Theorem 5.

Let be a slack matrix for a pair , where and for are polytopes and for , let (resp. ) be a valid upper bound (resp. lower bound) on variable in . Assume there exists a deterministic protocol with complexity computing , and let be the set of monochromatic rectangles in which it partitions (hence ). For , let is a column of and row of .

Suppose we are given and for each an extended formulation for . Let be the size (number of inequalities) of , and be the total encoding length of the description of (including the number of inequalities, variables and equations). Then we can construct an extended formulation for of size linear in in time linear in .

###### Proof.

We can assume without loss of generality that is a complete binary tree, i.e. each node of the protocol other than the leaves has exactly two children. Let be the set of nodes of and . Note that there exists exactly one (not necessarily monochromatic) rectangle associated to , which is given by the pairs such that, on input , the execution of the protocol visits node . Let us define, for any such , a pair with and

$$Q_v = \{x \in \mathbb{R}^d : a_i x \le b_i \ \ \forall i \text{ row of } S_v; \ \ \ell_j \le x_j \le u_j \text{ for all } j \in [d]\}.$$

Clearly , and is a polytope. Moreover, , and for the root of . We now show how to obtain an extended formulation for the pair given extended formulations ’s for , , where (resp. ) are the two children nodes of in .

Assume first that is labelled . Then we have (up to permutation of rows), since the bit sent by Alice at splits in two rectangles by rows – those corresponding to rows where she sends and those corresponding to rows where she sends . Therefore and . Hence we have , where is a projection from the space of to . An extended formulation for can be obtained efficiently by juxtaposing the formulations of .

Now assume that is labelled . Then similarly we have (up to permutations of columns). Hence, and , which implies . An extended formulation for can be obtained efficiently by applying Theorem 4 to the formulations of . Iterating this procedure, in a bottom-up approach we can obtain an extended formulation for from extended formulations of , for each leaf of the protocol.

We now bound the number of basic operations necessary to obtain our formulation. If , then . Consider now . From Theorem 4 we have . Since the binary tree associated to the protocol is complete, it has size linear in the number of leaves, hence for the final formulation we have

$$\sigma_+(T_\rho) \le \sum_{R \in \mathcal{R}} \sigma_+(T_R) + O(d) = O\Big(\sum_{R \in \mathcal{R}} \sigma_+(T_R)\Big),$$

where the last equation is justified by the fact that we can assume for any . The bound on the size of is derived in an analogous way. ∎
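The bottom-up construction in the proof can be mirrored by a short size-accounting sketch. It assumes a tuple encoding of the protocol tree that is not in the paper, and it charges two extra inequalities at each node where Theorem 4 is applied (for the bounds on the multiplier λ), which is an upper bound since those inequalities may be redundant.

```python
def formulation_size(node):
    """Upper bound on the number of inequalities built bottom-up.

    node is ("leaf", size), or (owner, left, right) with owner "A" or "B".
    """
    if node[0] == "leaf":
        return node[1]
    owner, left, right = node
    total = formulation_size(left) + formulation_size(right)
    if owner == "A":
        return total       # Alice node: juxtapose the two formulations
    return total + 2       # Bob node: union via Theorem 4, plus 0 <= lambda <= 1


# Hypothetical complete tree of height 2 whose four leaf formulations
# have 3 inequalities each.
leaf = ("leaf", 3)
tree = ("A", ("B", leaf, leaf), ("B", leaf, leaf))
size = formulation_size(tree)
```

Since a complete binary tree has fewer internal nodes than leaves, the additive terms are dominated by the sum over the leaves as soon as each leaf formulation has at least a constant number of inequalities.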

A couple of remarks on Theorem 5 are in order. The formulation that is produced may not have exactly the form given by the corresponding protocol. Also, even for the special case , the proof relies on the version of Yannakakis’ theorem for pairs of polytopes. On the other hand, it does not strictly require that we reach leaves of the protocol – a similar bottom-up approach would work starting at any node , as long as we have an extended formulation for .

The reader may recognize similarities between the proof of Theorem 5 and that of the main result in [fiorini1711strengthening], where a technique is given to construct extended formulations for polytopes using Boolean formulas. While similar in flavour, those two results seem incomparable, in the sense that one does not follow from the other. It is possible however that they both fall under a more general common framework.

Application to . We now describe how to apply Theorem 5 to the protocol from Theorem 3 so as to obtain an extended formulation for in time . In particular, this gives an extended formulation for , perfect, within the same time bound.

We first give a modified version of the protocol from [yannakakis1991expressing], stressing a few details that will be important in the following. The reader familiar with the original protocol can immediately verify its correctness. Let be the vertices of in any order. At the beginning of the protocol, Alice is given a clique of as input and Bob a stable set , and they want to compute the entry of the slack matrix of corresponding to , i.e. to establish whether intersect or not.

At each stage of the protocol, the vertices of the current graph are partitioned between low degree (i.e. at most ) and high degree . Suppose first . Alice sends (i) the index of the low degree vertex of smallest index in , or (ii) if no such vertex exists. In case (i), if , then and the protocol ends; else, is replaced by , where denotes the subgraph of induced by . In case (ii), if Bob has no high degree vertex, then and the protocol ends, else, is replaced by . If conversely , then the protocol proceeds symmetrically to above: Bob sends (i) the index of the high degree vertex of smallest index in , or (ii) if no such vertex exists. In case (i), if , then and the protocol ends; else, is replaced by . In case (ii), if Alice has no low degree vertex, then and the protocol ends, else, is replaced by . Note that at each step the number of vertices of the graph is decreased by at least half, and and do not intersect in any of the vertices that have been removed.
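The following sketch simulates a clique versus stable set exchange in the spirit of the protocol just described. It follows the classical low-degree/high-degree idea rather than every detail of the modified version above, and it assumes that "low degree" means degree smaller than half the number of vertices of the current graph. The adjacency-dictionary representation and all function names are illustrative choices.

```python
from itertools import combinations


def clique_vs_stable_set(adj, clique, stable):
    """Simulate the exchange; return (slack, transcript) with
    slack = 1 if clique and stable are disjoint, 0 otherwise."""
    current = set(adj)
    transcript = []
    while True:
        n = len(current)
        # Assumed threshold: low degree means degree < n/2 in the current graph.
        low = {v for v in current if 2 * len(adj[v] & current) < n}
        alice = sorted(v for v in clique & current if v in low)
        if alice:
            v = alice[0]
            transcript.append(("A", v))
            if v in stable:
                return 0, transcript
            current = adj[v] & current        # rest of the clique lies in N(v)
            continue
        transcript.append(("A", None))
        bob = sorted(u for u in stable & current if u not in low)
        if bob:
            u = bob[0]
            transcript.append(("B", u))
            if u in clique:
                return 0, transcript
            current = current - adj[u] - {u}  # rest of the stable set avoids N(u)
            continue
        transcript.append(("B", None))
        return 1, transcript                  # clique all high, stable set all low


def cliques_and_stable_sets(adj):
    """Brute-force enumeration, only meant for checking tiny graphs."""
    verts = sorted(adj)
    subsets = [set(c) for k in range(len(verts) + 1) for c in combinations(verts, k)]
    cliques = [s for s in subsets if all(b in adj[a] for a, b in combinations(s, 2))]
    stables = [s for s in subsets if all(b not in adj[a] for a, b in combinations(s, 2))]
    return cliques, stables


# Hypothetical example: the 6-cycle, a bipartite (hence perfect) graph.
adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
slack, transcript = clique_vs_stable_set(adj, {0, 1}, {1, 3, 5})
```

Each vertex that is sent shrinks the current graph to fewer than half of its vertices, so at most logarithmically many vertices are sent; the transcript (the sequence of sent vertices) is what identifies the leaf, and hence the rectangle, reached by an input pair.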

Now let be the slack matrix of the pair . Each monochromatic rectangle in which the protocol from Theorem 3 partitions is univocally identified by the list of cliques and of stable sets corresponding to its rows and columns. With a slight abuse of notation, for a clique (resp. stable set ) whose corresponding row is in , we write (resp. ), and we also write . We let be the convex hull of stable sets and the set of clique inequalities corresponding to cliques , together with the unit cube constraints.

We need a fact on the structure of , for which we introduce some more notation: for a (monochromatic) rectangle , let be the set of vertices sent by Alice and the set of vertices sent by Bob during the corresponding execution of the protocol. Note that is a clique and is a stable set.

###### Observation 6.

For a , there is exactly one clique and one stable set of such that and . Conversely, given a clique and a stable set , there is at most one rectangle such that and . Notice that for any .

Now, assuming we are given the graph as input, in order to apply Theorem 5 we need to perform two steps:

1. Obtain the tree with label set deriving from the protocol for .

A simple way is to first enumerate all cliques and stable sets of of size at most and run the protocol on each possible input pair to get (thanks to Observation 6). Then, derive the structure of (and the labels ) from the rectangles obtained: for instance, all the rectangles whose begins with vertex are descendants of the child of the root whose edge is labelled , etc.

2. For each leaf of corresponding to a rectangle , give a compact extended formulation for the pair .

Fix to be a 1-rectangle (i.e. a monochromatic rectangle of value 1), the -rectangle case being similar. Since is a non-negative rank-1 matrix, an extended formulation of is given by

$$\{x \in \mathbb{R}^d,\ y_R \in \mathbb{R} : x(C) + y_R = 1 \ \ \forall C \in R, \ \ y_R \ge 0, \ \ 0 \le x \le 1\}. \tag{3}$$

We now reduce the equations in the description above, which can be exponentially many, to a smaller system. For that we need the following fact on the structure of the rectangles.

###### Lemma 7.

Let and . Then for any such that and any such that we have .

###### Proof.

Note that a vertex is not sent during the protocol on input . Hence, the execution of the protocols on inputs and coincides. Indeed at every step Alice chooses the first vertex of low degree in her current clique, and if is never chosen, having in the clique does not affect her choice. Moreover, the choice of Bob only depends on his current stable set and the vertices previously sent by Alice. In particular, we have . Iterating the argument (and applying the symmetric for ) we conclude the proof. ∎

Now, we claim that

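$$T_R = \{x \in \mathbb{R}^d,\ y_R \in \mathbb{R} : \ x(C_R) + y_R = 1; \ \ x(C_R + v) + y_R = 1 \ \ \forall v \in V \setminus C_R : C_R + v \in R; \ \ y_R \ge 0; \ \ 0 \le x \le 1\}$$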

is an extended formulation for . It suffices to show that the coefficient vector of each equation from (3) is spanned by the coefficient vectors from equations in the formulation above. Let . For any , we have thanks to Lemma 7. Hence we obtain:

$$\sum_{v \in C \setminus C_R} \big(x(C_R + v) + y_R\big) - \big(|C \setminus C_R| - 1\big)\big(x(C_R) + y_R\big) = x(C) + y_R,$$

as required.
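The linear combination used in this argument is easy to sanity-check numerically. The sketch below evaluates both sides for an arbitrary integer vector x, with x(U) denoting the sum of the entries indexed by U; the function names and the sample sets are assumptions, and the check is only an illustration of the identity, which holds for every x and every C containing C_R.

```python
import random


def x_of(x, vertex_set):
    """x(U): sum of the coordinates of x indexed by the vertices in U."""
    return sum(x[v] for v in vertex_set)


def spanning_identity_holds(x, y_R, C_R, C):
    """Check that the combination of the equations for C_R and C_R + v,
    v in C minus C_R, gives the equation for C (assumes C_R is a subset of C)."""
    extra = C - C_R
    lhs = sum(x_of(x, C_R | {v}) + y_R for v in extra)
    lhs -= (len(extra) - 1) * (x_of(x, C_R) + y_R)
    return lhs == x_of(x, C) + y_R


random.seed(0)
x = {v: random.randint(-5, 5) for v in range(8)}  # arbitrary integer point
C_R, C = {0, 1}, {0, 1, 4, 6, 7}                  # hypothetical sets
ok = spanning_identity_holds(x, 3, C_R, C)
```

Integer entries are used so that the comparison is exact.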

We conclude by observing that the approach described above proceeds by obtaining the leaves of , with an expensive enumeration of cliques and stable sets, and then it reconstructs . This takes time . However, one could instead try to construct from the root, by distinguishing cases for each possible bit sent by Alice or Bob. This intuition is the basis for the alternative formulation that we give in the next section.

## 4 Applications and direct derivations

Complement graphs. An extended formulation for $\mathrm{STAB}(\bar{G})$, $G$ perfect, can be efficiently obtained from an extended formulation of $\mathrm{STAB}(G)$, keeping a similar dimension (including the number of equations). We use the following two known facts.

###### Lemma 8 ([schrijver2002combinatorial]).

is a perfect graph if and only if .

###### Lemma 9 ([Martin91, weltge2015sizes]).

Given a non-empty polyhedron and , let . If , then we have that

$$P = \{x : \exists\, \lambda \ge 0,\ \mu : A^T \lambda + C^T \mu = x,\ B^T \lambda + D^T \mu = 0,\ b^T \lambda + d^T \mu \le \gamma\}.$$

Hence .

The next fact then follows immediately.

###### Corollary 10.

Let be a perfect graph on vertices such that admits an extended formulation with additional variables (i.e. variables in total), inequalities and equations. Then admits an extended formulation with additional variables, inequalities, equations, which can be written down efficiently given .

, perfect. We now present an algorithm that, given a perfect graph on vertices, produces an explicit extended formulation for of size , in time bounded by . The algorithm is based on a decomposition approach inspired by Yannakakis’ protocol [yannakakis1991expressing], even though the formulation obtained has a different form than (1) (and different from the formulation obtained in the previous section). A key tool is the next lemma. For a vertex of , denotes the inclusive neighbourhood of .

Our algorithm can be seen as performing breadth-first search on the tree corresponding to the protocol for (as described in the previous section), and iteratively decomposing according to the non-leaf vertices met. Each vertex sent by Alice (resp. Bob) corresponds to a partition of the rows (resp. columns) of the slack matrix: informally, in our algorithm we model Alice’s partitions by decomposing , and Bob’s partitions by first taking the complementary graphs (intuitively, swapping cliques and stable sets, and the role of Alice and Bob) and then decomposing. This goes on until the subgraphs obtained are small enough for us to get a formulation for their stable set polytopes in constant time, after which we go bottom up and iteratively reconstruct the formulation for similarly as done in Theorem 5.

###### Lemma 11.

Let be a perfect graph on vertex set , and, fix with . Let be the induced subgraph of on vertex set for , and the induced subgraph of on vertex set . Then we have .

###### Proof.

We first remark that, since by definition for , and , we have . Recall that , where is the set of cliques of . Let us consider the partition of where a clique is in , for , if , and is in otherwise. Let . We have that if and only if for , with denoting the restriction of to coordinates in . We now claim that for . Notice that implies that is a clique in , which proves the “” inclusion. For the opposite inclusion, since each , being an induced subgraph of , is perfect, it suffices to show that includes all the maximal cliques of . Let be a maximal clique of . If , then and we are done. If , then as , and , which concludes the proof of the claim. Hence we have that if and only if , and for any clique in , but this is equivalent to . ∎
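The combinatorial fact driving this proof, namely that every clique of the graph lies entirely inside one of the subgraphs of the decomposition, can be checked by brute force on small instances. The sketch below assumes the decomposition reads as follows: for chosen vertices v_1, ..., v_k, the i-th subgraph is induced by the inclusive neighbourhood of v_i minus v_1, ..., v_{i-1}, and the last one (G_0) by the vertices other than v_1, ..., v_k. This reading, the function names and the example graph are assumptions.

```python
from itertools import combinations


def decompose(adj, chosen):
    """Vertex sets of G_1, ..., G_k, G_0 for the ordered list chosen = [v_1, ..., v_k]."""
    parts, removed = [], set()
    for v in chosen:
        parts.append((adj[v] | {v}) - removed)  # inclusive neighbourhood minus earlier v's
        removed.add(v)
    parts.append(set(adj) - removed)            # G_0: none of the chosen vertices
    return parts


def all_cliques(adj):
    """Brute-force enumeration of all cliques (including the empty one)."""
    verts = sorted(adj)
    for k in range(len(verts) + 1):
        for c in combinations(verts, k):
            if all(b in adj[a] for a, b in combinations(c, 2)):
                yield set(c)


def every_clique_is_covered(adj, parts):
    return all(any(c <= part for part in parts) for c in all_cliques(adj))


# Hypothetical example: the path 0-1-2-3-4 (a perfect graph), with v_1 = 0, v_2 = 4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
parts = decompose(adj, [0, 4])
covered = every_clique_is_covered(adj, parts)
```

A clique containing some chosen vertex is covered by the subgraph of the first such vertex in the order, and a clique containing none of them is covered by G_0, which is the case distinction used in the proof.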

A simple though important observation for our approach is the following.

###### Observation 12.

Let be polyhedra with , and let be an extended formulation for for , i.e. . Then .

###### Theorem 13.

Let be a perfect graph on vertices. There is an algorithm that, on input , outputs an extended formulation of of size in time.

###### Proof.

We argue by induction on . In this proof, logarithms are in base (note that this was not needed earlier, because of big notation). The base cases for bounded by a constant are trivial, as the size of the classical formulation (and the time to obtain it) is constant too. For general , let be the vertices of with degree at most and be defined as in Lemma 11. First, assume that , hence have all size at most . By induction, and thanks to Lemma 11 and Observation 12, running the algorithm on and then applying Lemma 11, we produce an extended formulation of from those of of size at most for some constant , but this is at most (assuming without loss of generality that ). The same bound holds for the total running time.

If , we take the complement graph , hence swapping low degree and high degree vertices: we now have , hence by the previous case the algorithm obtains a formulation of of size at most . We can then use Lemma 9 to efficiently obtain a formulation for , which by Corollary 10 has size at most . Similar calculations bound the number of variables and equations of the formulation. The same bound holds for the total running time. ∎

Extension to non-perfect graphs and general decomposition trees. Although in the previous section we restricted ourselves to perfect graphs for ease of exposition, it is not hard to show that the above algorithm can be used on general graphs, yielding an extended formulation of of quasipolynomial size. Moreover, one can notice that the correctness of the algorithm does not depend on the decomposition procedure chosen: informally, one can define an arbitrary decomposition tree whose root corresponds to and whose nodes correspond to either decomposing the graph as in Lemma 11 (with any choice of vertices ) or taking the complement graph. Using such tree and proceeding similarly as above we can obtain an extended formulation whose size only depends on the size of the tree and on the formulations we have for the subgraphs corresponding to leaves of the tree. We will see an application of this to threshold-free graphs (Theorem 17).

###### Theorem 14.

Let be a graph on vertices. Then there is an algorithm that, on input , outputs an extended formulation of of size in time.

###### Proof.

The decomposition scheme outlined in the proof of Theorem 13 can be associated to a decomposition tree on nodes as follows: at each step, either decompose the current graph using Lemma 11 (in which case each child is one of the ’s), or take the complement (in which case there is a single child, associated to ). We will abuse notation and identify a node of the decomposition tree with the corresponding subgraph. Note, in particular, that this decomposition tree does not depend on the fact that is perfect, hence it can be applied here as well.

We can assume that, for each leaf of , we are given an extended formulation of . Consider the extended formulation, which we call , obtained by traversing bottom-up and applying the following:

1. if a non-leaf node of has a single child , then define to be equal to the extended formulation of , obtained by applying Lemma 9.

2. otherwise if has children , then we define to be the extended formulation obtained from using Observation 12.
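The bottom-up traversal described by the two rules above can be sketched as a post-order recursion. This is only an illustrative sketch: the names `Node`, `build_formulation`, `leaf_formulation`, `complement_step` and `combine_step` are hypothetical, and the last two are placeholders standing in for the constructions of Lemma 9 and Observation 12, which are not implemented here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Node:
    """A node of a decomposition tree; `graph` is the subgraph it stands for."""
    graph: Any
    children: List["Node"] = field(default_factory=list)


def build_formulation(
    node: Node,
    leaf_formulation: Callable[[Any], Any],
    complement_step: Callable[[Any], Any],
    combine_step: Callable[[List[Any]], Any],
) -> Any:
    """Post-order traversal: children are processed before their parent."""
    if not node.children:
        # Leaves come with a formulation that is assumed to be given.
        return leaf_formulation(node.graph)
    child_forms = [
        build_formulation(c, leaf_formulation, complement_step, combine_step)
        for c in node.children
    ]
    if len(child_forms) == 1:
        # Rule 1: a single child is the complement graph (placeholder for Lemma 9).
        return complement_step(child_forms[0])
    # Rule 2: several children are merged (placeholder for Observation 12).
    return combine_step(child_forms)
```

Passing the two steps as callables keeps the traversal independent of how the formulations are represented, which mirrors the remark that correctness does not depend on the decomposition procedure chosen.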

We only need to prove that is an extended formulation for , as the efficiency aspects have been discussed in Theorems 5 and 13. We proceed by induction on the height of ; in particular, we prove that for a node of , is an extended formulation of , assuming this is true for the children of . If is a leaf of , then there is nothing to prove. Otherwise, we need to analyze two cases:

1. has a single child, labelled . Assume that , where is the projection on the appropriate space. Let be a stable set in , and the corresponding incidence vector. For any , we have as is a clique in , hence since (hence ) is clearly convex it follows that . Now, for a clique of and , one has hence , proving .

2. has children . Assume that for , . Let be the characteristic vector of a stable set of , for every is a stable set in , hence , and by convexity we conclude again that . Finally, let , and let be a clique of . It follows from the way we decompose that is contained in for some : indeed, if , let be the minimum such that , then by definition; if , then . By the induction hypothesis, . But then , and since this holds for all the cliques of , we have .

Theorem 14 generalizes Theorem 13 and, we believe, can have interesting computational consequences: indeed, there is interest in producing relaxations of without explicitly computing the Theta body (see for instance [giandomenico2015ellipsoidal]).

Claw-free perfect graphs and generalizations. Let where is a claw-free, perfect graph on vertices. As is perfect, the (non-trivial part of the) slack matrix of is the clique vs stable set incidence matrix of , and can be computed by the following protocol, given in [faenza2015extended]. Alice, who has a clique as input, sends a vertex to Bob, who has a stable set . Now, since is claw-free, we have , and clearly , hence Bob can just send and Alice then knows the intersection . The protocol has complexity at most ; hence, by applying Theorem 1, we get the following formulation of size :

$$
x(C) + \sum_{R \in \mathcal{R}_C} y_R = 1 \quad \forall\, C \text{ clique of } G, \qquad y \ge 0. \tag{4}
$$

where contains a (monochromatic) rectangle for each pair , where and with stable (i.e., ), and , following the notation used in (1), denotes the set of rectangles including the row index corresponding to . Notice that, for the rectangles in to partition the slack matrix of , we need to specify which vertex is sent by Alice given a certain clique as input: for this we can simply fix an order of the vertices and assume that Alice sends the vertex of her clique that is first in the order. Hence the rectangles in have the form where is the “first” vertex of . We now derive a more compact formulation than (4), getting rid of provably redundant equations. Before doing so, we note that the above protocol can be easily generalized to perfect -free graphs for : in this case the set is defined similarly as before, except that now we have rectangles with . This yields a formulation of size . We state our result for this more general class of graphs: informally, the only clique equations that we keep are those coming from singletons and edges, obtaining a formulation with only many equations.
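The protocol and the rectangles it induces can be checked by brute force on a toy instance. The sketch below makes several assumptions that are not stated in the text: the graph is the path 0-1-2-3 (claw-free and perfect), Bob's message is taken to be the intersection of his stable set with the closed neighbourhood of the vertex he receives, and a rectangle is indexed by a pair `(v, U)` with `U` a stable subset of the neighbourhood of `v` of size at most 2. All function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy instance: the path 0-1-2-3 (claw-free and perfect).
EDGES = {(0, 1), (1, 2), (2, 3)}
VERTICES = range(4)


def adjacent(u, v):
    return (u, v) in EDGES or (v, u) in EDGES


def neighbours(v):
    return frozenset(u for u in VERTICES if adjacent(u, v))


def subsets_where(pairwise_ok, min_size):
    """Vertex subsets of size >= min_size whose pairs all satisfy pairwise_ok."""
    return [frozenset(c)
            for r in range(min_size, len(VERTICES) + 1)
            for c in combinations(VERTICES, r)
            if all(pairwise_ok(u, v) for u, v in combinations(c, 2))]


CLIQUES = subsets_where(adjacent, 1)                             # non-empty cliques
STABLE_SETS = subsets_where(lambda u, v: not adjacent(u, v), 0)  # includes the empty set


def run_protocol(clique, stable):
    """Return (Alice's vertex, Bob's reply, computed slack entry)."""
    v = min(clique)                           # Alice sends the first vertex of her clique
    reply = stable & (neighbours(v) | {v})    # Bob's message (assumed form)
    return v, reply, 1 - len(clique & reply)  # the clique lies in N[v], so this is 1 - |C ∩ S|


def coverage():
    """Number of rectangles (v, U) containing each slack-matrix entry (C, S)."""
    count = {(C, S): 0 for C in CLIQUES for S in STABLE_SETS}
    for v in VERTICES:
        for U in STABLE_SETS:
            if len(U) > 2 or not U <= neighbours(v):
                continue                      # keep only stable U inside N(v) with |U| <= 2
            rows = [C for C in CLIQUES if min(C) == v and not (C & U)]
            cols = [S for S in STABLE_SETS if S & (neighbours(v) | {v}) == U]
            for C in rows:
                for S in cols:
                    count[(C, S)] += 1
    return count
```

On this instance one can compare the third component of `run_protocol(C, S)` with `1 - len(C & S)` for every pair, and check that `coverage()` assigns each entry with empty intersection to exactly one rectangle and every other entry to none.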

###### Theorem 15.

Let be a perfect and -free graph. Let as above. Then the following is an extended formulation for :

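$$
\begin{aligned}
x(v) + \sum_{R \in \mathcal{R}_v} y_R &= 1 \quad \forall\, v \in V, \\
x(e) + \sum_{R \in \mathcal{R}_e} y_R &= 1 \quad \forall\, e \in E, \\
y &\ge 0.
\end{aligned} \tag{5}
$$

###### Proof.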

Thanks to the above discussion, we only need to show that, for any clique of with , the equation is implied by the equations in (5). From now on, fix such and let be the first vertex of (in the order fixed by the protocol) and consider the following expression, obtained by summing the non-constant part of the equations relative to , for every :

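$$
\begin{aligned}
&\sum_{e = uv \,:\, u \in C - v} \Bigl( x(e) + \sum_{R \in \mathcal{R}_e} y_R \Bigr) = \\
&\qquad (k-2)\,x(v) + x(C) + \sum_{e = uv \,:\, u \in C - v} \Bigl( \sum_{R \in \mathcal{R}_e \cap \mathcal{R}_C} y_R + \sum_{R \in \mathcal{R}_e \setminus \mathcal{R}_C} y_R \Bigr)
\end{aligned}
$$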
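The coefficients of the x-terms on the right-hand side can be verified directly. The following is a sketch under two assumptions about the notation, neither of which is stated in the surrounding text: that k denotes the number of vertices of the fixed clique, and that x(e) stands for x(u) + x(v) when e = uv.

$$
\sum_{u \in C - v} \bigl( x(u) + x(v) \bigr) = x(C - v) + (k-1)\,x(v) = \bigl( x(C) - x(v) \bigr) + (k-1)\,x(v) = (k-2)\,x(v) + x(C).
$$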

Now, recall that the slack matrix of has 0/1 entries and a 1-rectangle is determined by a pair , where is a vertex sent by Alice and is the set of vertices sent by Bob. If the rectangle covers a 1-entry , then is the first vertex of and , with (as otherwise would be a 0-entry). Hence, we can derive for , and , where denotes the family of the stable sets of . Hence for . We can then rewrite the above expression as:

 (k−2)x(v)+x(C)+(k−1)∑R∈RCyR+∑e=uv:u∈C