A large number of important algorithmic tasks can be construed as constraint satisfaction problems (CSPs): finding an assignment to Boolean variables to optimize the number of satisfied constraints. Almost every form of constraint optimization is NP-complete; thus one is led to questions of efficiently finding near-optimal solutions, or understanding the complexity of average-case rather than worst-case instances. Indeed, understanding the complexity of random sparse CSPs is of major importance not just in traditional algorithms theory, but also in, e.g., cryptography [JLS20], statistical physics [MM09], and learning theory [DLS14].
Suppose we fix the model for a random sparse CSP on variables (e.g., random -SAT with a certain clause density). Then it is widely believed that there should be a constant such that the optimal value of a random instance is with high probability (whp). (Here we define the optimal value to mean the maximum number of simultaneously satisfiable constraints, divided by the number of variables.) Unfortunately, it is extremely difficult to prove this sort of result; indeed, it was considered a major breakthrough when Bayati, Gamarnik, and Tetali [BGT13] established it for one of the simplest possible cases: Max-Cut on random -regular graphs (which we will denote by ). Actually “identifying” the value
(beyond just proving its existence) is even more challenging. It is generally possible to estimate
using heuristic methods from statistical physics, but making these estimates rigorous is beyond the reach of current methods. Taking again the example of Max-Cut on -regular random graphs, it was only recently [DMS+17] that the value was determined up to a factor of . The value for any particular , e.g. , has yet to be established.
Returning to algorithmic questions, we can ask about the computational feasibility of optimally solving sparse random CSPs. There are two complementary questions to ask: given a random instance from model (with presumed optimal value ), can one efficiently find a solution achieving value , and can one efficiently certify that every solution achieves value ? The former question is seemingly a bit more tractable; for example, a very recent breakthrough of Montanari [MON21] gives an efficient algorithm for (whp) finding a cut in a random graph of value at least . On the other hand, we do not know any algorithm for efficiently certifying (whp) that a random instance has value at most . Indeed, it is reasonable to conjecture that no such algorithm exists, leading to an example of a so-called “information-computation gap”.
To bring evidence for this we can consider semidefinite programming (SDP), which provides efficient algorithms for certifying an upper bound on the optimal value of a CSP [FL92]. Indeed, it is known [RAG09] that, under the Unique Games Conjecture, the basic SDP relaxation provides essentially optimal certificates for CSPs in the worst case. In this paper we in particular consider Boolean 2CSPs — more generally, optimizing a homogeneous degree- polynomial over the hypercube — as this is the setting where semidefinite programming is most natural. Again, for a fixed model of random sparse Boolean 2CSPs, one expects there should exist a constant such that the optimal SDP-value of an instance from is whp . Philosophically, since semidefinite programming is doable in polynomial time, one may be more optimistic about proving this and explicitly identifying . Indeed, some results in this direction have recently been established.
1.1 Prior work on identifying high-probability SDP values
Let us consider the most basic case: , Max-Cut on random -regular graphs.
For ease of notation, we will consider the equivalent problem of maximizing over , where is the adjacency matrix of a random -vertex -regular graph. (Throughout this work, boldface is used to denote random variables.)
Although , the high-probability SDP relaxation value, was pursued as early as 1987 [BOP87] (see also [DH73]), it was not until 2015 that Montanari and Sen [MS16] established the precise result . That is, in a random -regular graph, whp the basic SDP relaxation value [BOP87, RW95, DP93, PR95] for the size of the maximum cut is . Here the special number
is the maximum eigenvalue of the -regular infinite tree.
The proof of this result has two components: showing whp, and showing whp. Here denotes the “primal” SDP value on matrix (commonly associated with the Goemans–Williamson rounding algorithm [GW95]), and denotes the (equal) “dual” SDP value on . To show the latter bound, it is sufficient to observe that , the “eigenvalue bound”, and whp by Friedman’s Theorem [FRI08]. As for lower-bounding , Montanari and Sen used the “Gaussian wave” method [ELO09, CGH+15, HV15] to construct primal SDP solutions achieving at least
(whp). The idea here is essentially to build the SDP solutions using an approximate eigenvector (of finite support) of the infinite -regular tree achieving eigenvalue ; the fact that the SDP constraint “” can be satisfied relies heavily on the regularity of the graph.
The Montanari–Sen result in passing establishes that the (high-probability) eigenvalue and SDP bounds coincide for random regular graphs. This is consistent with a known theme, that the two bounds tend to be the same (or nearly so) for graphs where “every vertex looks similar” (in particular, for regular graphs). This theme dates back to Delorme and Poljak [DP93], who showed that whenever is the adjacency matrix of a vertex-transitive graph.
Subsequently, the high-probability SDP value was established for a few other models of random regular 2CSPs. Deshpande, Montanari, O’Donnell, Schramm, and Sen [DMO+19] showed that for , meaning random regular instances of NAE-3SAT (not-all-equals 3SAT) with each variable participating in clauses, we have . We remark that NAE-3SAT is effectively a 2CSP, as the predicate may be expressed as , supported on the “triangle” formed by variables . The analysis in that paper is somewhat similar to that in [MS16], but with the infinite graph ( times) replacing the -regular infinite tree. This is the -regular infinite “tree of triangles” depicted (partly, in the case ) in Figure 1. More generally, [DMO+19] established the high-probability SDP value for large random (edge-signed) graphs that locally resemble , the -regular infinite “tree of cliques ”. (The case essentially generalizes [MS16].) As in [MS16], coincides with the (high-probability) eigenvalue bound. The upper bound on is shown by using Bordenave’s proof [BOR20] of Friedman’s Theorem for random -biregular graphs. The lower bound on is shown using the Gaussian wave technique, relying on the distance-regularity of the graphs (indeed, it is known that every infinite distance-regular graph is of this form).
Figure 1: The -regular infinite graph , modeling random -regular NAE3-SAT.
Mohanty, O’Donnell, and Paredes [MOP20] generalized the preceding two results to the case of “two-eigenvalue” 2CSPs. Roughly speaking, these are 2CSPs formed by placing copies of a small weighted “constraint graph” (required to have just two distinct eigenvalues) in a random regular fashion onto vertices/variables. (This is indeed a generalization of [DMO+19], as cliques have just two distinct eigenvalues.) As two-eigenvalue examples, [MOP20] considered CSPs with the “CHSH constraint” (and its generalizations, the “Forrelation” constraints), which are important in quantum information theory [CHS+69, AA18]. Here the SDP value of an instance is particularly relevant as it is precisely the optimal “quantum entangled value” of the 2CSP [CHT+04]. Once again, it is shown in [MOP20] that the high-probability SDP and eigenvalue bounds coincide for these types of CSPs. The two-eigenvalue condition is used at a technical level in both the variant of Bordenave’s theorem proven for the eigenvalue upper bound, and in the Gaussian wave construction in the SDP lower bound.
for a very wide range of random 2CSPs, encompassing all those previously mentioned: namely, any quadratic optimization problem defined by random “matrix polynomial lifts” over literals.
1.2 Our work
In this work, we establish the high-probability SDP value for random instances of any 2CSP model arising from lifting matrix polynomials (as in [OW20]). This generalizes all previously described work on SDP values, and covers many more cases, including random lifts of any base 2CSP and random graphs modeled on any free/additive/amalgamated product. Such graphs have seen numerous applications within theoretical computer science, for example the zig-zag product in derandomization (e.g., [RVW02]) and lifts of 2CSPs in the study of the stochastic block model (e.g., [BBK+20]). See Section 2 for more details and definitions, and see [OW20] for a thorough description of the kinds of random graph/2CSP models that can arise from matrix polynomials.
Very briefly, a matrix polynomial is a small, explicit “recipe” for producing random -vertex edge-weighted graphs, each of which “locally resembles” an associated infinite graph . For example, is a recipe for random (edge-signed) -regular -vertex graphs, and here is the infinite -regular tree. As another example, if denotes the following matrix polynomial —
— then is a recipe for random (edge-signed) -regular -vertex graphs where every vertex participates in triangles.
In this case, is the infinite graph (partly) depicted in Figure 1.
The Bordenave–Collins theorem [BC19] shows that if is the adjacency matrix of a random unsigned -vertex graph produced from a matrix polynomial , then whp the “nontrivial” spectrum of will be within (in Hausdorff distance) of the spectrum of .
In the course of derandomizing this theorem, O’Donnell and Wu [OW20] established that for random edge-signed graphs, the modifier “nontrivial” should be dropped.
As a consequence, in the signed case one gets up to an additive , whp; i.e., the high-probability eigenvalue bound for CSPs derived from is precisely .
We remark that for simple enough there are formulas for ; regarding our two examples above, it is for , and it is for .
In particular, if is a linear matrix polynomial, may be determined numerically with the assistance of a formula of Lehner [LEH99] (see also [GK21] for the case of standard random lifts of a fixed base graph).
In this paper we investigate the high-probability SDP value — denote it — of a large random 2CSP (Boolean quadratic optimization problem) produced by a matrix polynomial .
Critically, our level of generality lets us consider non-regular random graph models, in contrast to all previous work.
Because of this, we see cases, in contrast to Section 1.1, where (whp) the SDP value is strictly smaller than the eigenvalue relaxation bound.
As a simple example, for random edge-signed -biregular graphs, the high-probability eigenvalue bound is , but our work establishes that the high-probability SDP value is .
An essential part of our work is establishing the appropriate notion of the “SDP value” of an infinite graph , with adjacency operator . While the eigenvalue bound makes sense for the infinite-dimensional operator , the SDP relaxation does not. The definition does not make sense, since “” is . Alternatively, if one tries the normalization , any such will have infinite trace, and hence may be . Indeed, since the only control we have on ’s “size” will be an operator norm (“-norm”) bound, the expression is only guaranteed to make sense if is restricted to be trace-class (i.e., have finite “-norm”).
On the other hand, we know that the eigenvalue bound is too weak, intuitively because it does not properly “rebalance” graphs that are not regular/vertex-transitive. The key to obtaining the correct bound is introducing a new notion, intermediate between the eigenvalue and SDP bounds, that is appropriate for graphs arising from matrix polynomial recipes. Although these graphs may be irregular, their definition also allows them to be viewed as vertex-transitive infinite graphs with matrix edge-weights. In light of their vertex-transitivity, Section 1.1 suggests that a “maximum eigenvalue”-type quantity — suitably defined for matrix-edge-weighted graphs — might serve as the sharp replacement for SDP value. We introduce such a quantity, calling it the partitioned SDP bound. Let be an -vertex graph with matrices as edge-weights, and let be its adjacency matrix, thought of as a Hermitian matrix whose entries are matrices. We will define
where here refers to the matrix obtained by summing the entries on ’s main diagonal (themselves matrices), and denotes the scalar in the -position of . This partitioned SDP bound may indeed be regarded as intermediate between the maximum eigenvalue and the SDP value. On one hand, given a scalar-edge-weighted -vertex graph with adjacency matrix , we may take and then it is easily seen that coincides with . On the other hand, if we regard as a matrix and take (so that we have a single vertex with a self-loop weighted by all of ), then .
As we will see in Section 3, can be suitably defined even for bounded-degree infinite graphs with edge-weights. Furthermore, it has the following SDP dual:
In the technical Section 3.1, we show that there is no SDP duality gap between and , even in the case of infinite graphs. It is precisely the common value of and that is the high-probability SDP value of large random 2CSPs produced from ; our main theorem is the following: Let be a matrix polynomial with coefficients. Let be the adjacency operator (with entries) of the associated infinite lift , and write . Then for any and sufficiently large , if is the adjacency matrix of a random edge-signed -lift of , it holds that except with probability at most .
Note that is a fixed value only dependent on the polynomial , a finitary object.
The upper bound in this theorem can be derived from the results of [BC19, OW20]. Our main work is to prove the lower bound . For this, our approach is inspired by the Gaussian Wave construction of [MS16, DMO+19] for -regular graphs (in the random lifts model), which can be viewed as constructing a feasible solution of value
using a truncated eigenfunction of . Since local neighborhoods in look like local neighborhoods in with high probability, the eigenfunction can be “pasted” almost everywhere into the graph , which gives an SDP solution of value near .
This approach runs into clear obstacles in our setting. Indeed, the raw eigenfunctions are of no use to us, as the SDP value may be smaller than the spectral relaxation value. Instead, we first show that there is a with only finitely many nonzero entries that achieves the in Equation 2 up to . This is effectively a finite matrix-edge-weighted graph. We then show that this can (just as in the regular case) whp be “pasted” almost everywhere into the graph defined by , which gives an SDP solution of value close to . The fact that and are regarded as regular tree-like graphs with matrix edge-weights (rather than as irregular graphs with scalar edge-weights) is crucially used to show that the “pasted solution” satisfies the finite SDP’s constraints “”.
To preface the following definitions and concepts, we remark that our “real” interest is in graphs (or 2CSPs) with real scalar weights. The introduction of matrix edge-weights facilitates two things: helping us define a wide variety of interesting scalar-weighted graphs via matrix polynomial lifts; and, facilitating the definition of , which we use to bound the SDP relaxation value of the associated 2CSPs. Our use of potentially complex matrices is also not really essential; we kept them because prior work that we utilize ([BC19], the tools in Section 3) is stated in terms of complex matrices. However the reader will not lose anything by assuming that all Hermitian matrices are in fact symmetric real matrices.
2.1 Matrix-weighted graphs
In the following definitions, we’ll restrict attention to graphs with at-most-countable vertex sets and bounded degree. We also often use bra-ket notation, with
denoting the standard orthonormal basis for the complex vector space.
[Matrix-weighted graph] Fix any . A matrix-weighted graph will refer to a directed simple graph with self-loops allowed, in which each directed edge has an associated weight . If and , we say that is an undirected matrix-weighted graph. The adjacency matrix of is the operator , acting on , given by
It can be helpful to think of in matrix form, as a matrix whose entries are themselves edge-weight matrices. Note that is undirected if and only if is self-adjoint, .
[Extension of a matrix-weighted adjacency matrix] Given a matrix with entries, we may also view it as a matrix with scalar entries. When we wish to explicitly call attention to the distinction, we will call the latter matrix the extension of , and denote it by .
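To make the two viewpoints concrete, the following is a minimal sketch of assembling the scalar extension from a table of matrix edge-weights. The names `extension` and `is_undirected`, the dictionary encoding of edges, and the convention that block (u, v) stores the weight of the directed edge (u, v) are all assumptions made for illustration; they are not notation from this paper.

```python
import numpy as np

def extension(n, r, edge_weights):
    """Assemble the (n*r) x (n*r) scalar matrix whose r x r blocks are edge-weights.

    edge_weights maps a directed edge (u, v) to an r x r array. Block (u, v)
    of the result holds that array (an indexing convention assumed here).
    """
    A = np.zeros((n * r, n * r), dtype=complex)
    for (u, v), w in edge_weights.items():
        A[u * r:(u + 1) * r, v * r:(v + 1) * r] = w
    return A

def is_undirected(edge_weights):
    """Every edge (u, v) must have a reverse edge weighted by the adjoint."""
    return all(
        (v, u) in edge_weights and np.allclose(edge_weights[(v, u)], w.conj().T)
        for (u, v), w in edge_weights.items()
    )

# Hypothetical example: 3 vertices, 2 x 2 weights, one edge pair and one self-loop.
a = np.array([[0, 1], [2, 0]], dtype=complex)
weights = {(0, 1): a, (1, 0): a.conj().T, (2, 2): np.eye(2, dtype=complex)}
A_ext = extension(3, 2, weights)

# A single directed edge with no reverse edge, for contrast.
directed_only = {(0, 1): a}
B_ext = extension(3, 2, directed_only)
```

In this toy case the undirected graph yields a Hermitian extension, while the one-directional graph does not, matching the remark that undirectedness corresponds to self-adjointness.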
2.2 Matrix polynomials
[Matrix polynomial] Let be formal indeterminates that are their own inverses, and let be formal indeterminates with formal inverses . For a fixed , we define a matrix polynomial to be a formal noncommutative polynomial over the indeterminates , with coefficients in . In particular, we may write
where the sum is over words on the alphabet of indeterminates, each is in , and only finitely many are nonzero. Here we call a word reduced if it has no adjacent or pairs. We will denote the empty word by .
As we will shortly describe, we will always be considering substituting unitary operators for the ’s, and unitary involution operators for the ’s. Thus we can think of as both the inverse and the “adjoint” of indeterminate , and similarly we think of .
[Adjoint of a polynomial] Given a matrix polynomial as above, we define its adjoint to be
where is the usual adjoint of , and is the adjointed reverse of . That is, if then , where , , and . We say is self-adjoint if formally.
Note that in any self-adjoint polynomial, some terms will be self-adjoint, and others will come in self-adjoint pairs. In this work, we will only be considering self-adjoint polynomials.
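The adjoint operation can be sketched in a few lines under an assumed encoding: a letter is a tuple `(kind, index, exponent)`, a word is a tuple of letters, and a polynomial is a dictionary from words to coefficient matrices. This encoding and the function names are hypothetical, chosen only to illustrate the definition above.

```python
import numpy as np

def adjoint_letter(letter):
    """Self-inverse letters ("Y") are fixed; invertible letters ("Z") flip exponent."""
    kind, idx, exp = letter
    return letter if kind == "Y" else (kind, idx, -exp)

def adjoint_word(word):
    """The adjointed reverse of a word: reverse the order and adjoint each letter."""
    return tuple(adjoint_letter(l) for l in reversed(word))

def adjoint_poly(p):
    """Adjoint each coefficient and each word."""
    return {adjoint_word(w): a.conj().T for w, a in p.items()}

def is_self_adjoint(p):
    q = adjoint_poly(p)
    return p.keys() == q.keys() and all(np.allclose(p[w], q[w]) for w in p)

Y1 = ("Y", 1, 1)
Z1, Z1_inv = ("Z", 1, 1), ("Z", 1, -1)
a = np.array([[0.0, 1.0], [0.0, 0.0]])

# One self-adjoint term (on Y1) plus one self-adjoint pair (on Z1 and its inverse).
p_good = {(Y1,): np.eye(2), (Z1,): a, (Z1_inv,): a.T}
# Missing the partner term for Z1, hence not self-adjoint.
p_bad = {(Z1,): a}
```

Here `p_good` illustrates the remark that self-adjoint polynomials consist of self-adjoint terms together with self-adjoint pairs.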
2.3 Lifts of matrix polynomials
Given a matrix polynomial over the indeterminates , we define an -lift to be a sequence of matrices, where each is a signed permutation matrix and each is a signed matching matrix. (A signed matching matrix is the adjacency matrix of a perfect matching with edge-signs. If then we must restrict to even .)
A random -lift is one where the matrices are chosen independently and uniformly at random.
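As an illustration, the two kinds of random matrices in a lift could be sampled as follows. This is only a sketch: the function names are hypothetical, and we assume "uniformly at random" means a uniform permutation (respectively, uniform perfect matching) with independent uniform signs.

```python
import numpy as np

def signed_permutation(n, rng):
    """Uniform random permutation matrix with independent uniform +/-1 signs."""
    perm = rng.permutation(n)
    signs = rng.choice([-1.0, 1.0], size=n)
    P = np.zeros((n, n))
    P[np.arange(n), perm] = signs
    return P

def signed_matching(n, rng):
    """Adjacency matrix of a uniform random perfect matching with +/-1 edge-signs."""
    if n % 2:
        raise ValueError("a perfect matching requires an even number of vertices")
    order = rng.permutation(n)
    M = np.zeros((n, n))
    # Pairing consecutive entries of a uniform permutation gives a uniform matching.
    for u, v in zip(order[0::2], order[1::2]):
        s = rng.choice([-1.0, 1.0])
        M[u, v] = M[v, u] = s
    return M

rng = np.random.default_rng(0)
P = signed_permutation(6, rng)
M = signed_matching(6, rng)
```

Both samples are orthogonal matrices, and the matching is additionally symmetric and squares to the identity, which is the unitary-involution property used later.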
[Evaluation/substitution of lifts.]
Given an -lift and a word , we define to be the operator obtained by substituting appropriately into : namely, , , and for each (and substituting the empty word with the identity operator).
Given also a matrix polynomial , we define the evaluation of at to be the following operator on : (Note that coefficients are written on the left in , as is conventional, but we take the tensor/Kronecker product on the right so that the matrix form of may be more naturally regarded as an matrix with entries.)
Note that each is unitary and each a unitary involution (as promised), so . Thus is a self-adjoint operator whenever is a self-adjoint polynomial. In this case we also have that may be viewed as the adjacency matrix of an undirected graph on vertex set with edge-weights.
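For a linear self-adjoint polynomial, the evaluation can be sketched directly with Kronecker products. The function name `evaluate_linear`, the restriction to linear polynomials, and the small hand-picked lift below are assumptions made for illustration; the coefficient on each self-inverse indeterminate is assumed Hermitian so that the polynomial is self-adjoint.

```python
import numpy as np

def evaluate_linear(coeff_Y, coeff_Z, lift_Y, lift_Z):
    """Evaluate sum_i a_i Y_i + sum_j (b_j Z_j + b_j^* Z_j^{-1}) at a lift.

    Coefficients go on the right of each Kronecker product, so the result is
    an n x n array of r x r blocks.
    """
    n = (lift_Y + lift_Z)[0].shape[0]
    r = (coeff_Y + coeff_Z)[0].shape[0]
    A = np.zeros((n * r, n * r), dtype=complex)
    for a, M in zip(coeff_Y, lift_Y):
        A += np.kron(M, a)
    for b, P in zip(coeff_Z, lift_Z):
        A += np.kron(P, b) + np.kron(P.conj().T, b.conj().T)
    return A

# A hand-picked 4-lift: one signed matching and one signed cyclic shift.
M = np.zeros((4, 4))
M[0, 1] = M[1, 0] = 1.0
M[2, 3] = M[3, 2] = -1.0
P = np.zeros((4, 4))
for i, s in enumerate([1.0, -1.0, 1.0, 1.0]):
    P[i, (i + 1) % 4] = s

a = np.array([[0.0, 1.0], [1.0, 0.0]])   # Hermitian coefficient on the matching
b = np.array([[0.0, 1.0], [0.0, 0.0]])   # arbitrary coefficient on the permutation
A = evaluate_linear([a], [b], [M], [P])
```

With this ordering, the block of `A` at position (u, v) is the matrix edge-weight between vertices u and v, which is the reason for placing the coefficient on the right.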
Note that the evaluation of a matrix polynomial may be viewed as the adjacency matrix of an undirected graph on with edge-weights; or, its extension may be viewed as the adjacency matrix of an undirected graph on with scalar edge-weights. In this way, each fixed matrix polynomial , when applied to a random lift, gives rise to a random (undirected, scalar-weighted) graph model. A simple example is the matrix polynomial
Here and each coefficient is just the scalar . This gives rise to a model of random edge-signed -regular graphs on .
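A quick numerical experiment can illustrate this model. The sketch below assumes the polynomial is a sum of d self-inverse indeterminates with scalar coefficient 1, so that a random lift is a sum of d random signed perfect matchings; it then compares the largest eigenvalue with 2·sqrt(d−1), the well-known spectral radius of the infinite d-regular tree. The helper names, the choice d = 3, n = 600, and the seed are arbitrary choices made here.

```python
import numpy as np

def signed_matching(n, rng):
    """Uniform random perfect matching on n (even) vertices with +/-1 edge-signs."""
    order = rng.permutation(n)
    u, v = order[0::2], order[1::2]
    signs = rng.choice([-1.0, 1.0], size=n // 2)
    M = np.zeros((n, n))
    M[u, v] = signs
    M[v, u] = signs
    return M

def random_signed_regular(n, d, rng):
    """Sum of d independent signed matchings (multi-edges are possible but rare)."""
    return sum(signed_matching(n, rng) for _ in range(d))

rng = np.random.default_rng(1)
n, d = 600, 3
A = random_signed_regular(n, d, rng)
top = float(np.linalg.eigvalsh(A)[-1])
tree_radius = 2.0 * np.sqrt(d - 1)
print(f"largest eigenvalue {top:.3f} vs. tree spectral radius {tree_radius:.3f}")
```

For moderate n one should expect the two printed numbers to be close but not equal; the agreement is only asymptotic and holds with high probability.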
By moving to actual matrix coefficients with , one can get the random (signed) graph model given by randomly -lifting any base -vertex graph . As a simple example,
is the recipe for random -regular (edge-signed) -vertex bipartite graphs. The reader may like to view this as a matrix of polynomials,
but recall that we actually Kronecker-product the coefficient matrices “on the other side”. So rather than as a block-matrix with blocks, we think of the resulting adjacency matrix as an block-matrix with blocks; equivalently, an -vertex graph with matrix edge-weights. The matrix polynomial mentioned in (1) gives an example of a nonlinear polynomial with matrix coefficients. Again, we wrote it there as a matrix of polynomials for compactness, but for analysis purposes we will view it as a degree- polynomial with coefficients.
[-lift] Formally, we extend Section 2.3 to the case of as follows. Let denote the free product of groups , with its components generated by , . Thus the elements of are in one-to-one correspondence with the reduced words over indeterminates . The generators act as permutations on by left-multiplication, with the first in fact being matchings. We write for these permutations, and we also identify them with their associated permutation operators on . Finally, we write for “the” -lift associated to . (Note that this lift is “unsigned”.)
[Evaluation at the -lift, and .] The evaluation of a matrix polynomial at the infinity lift is now defined just as in Footnote 2; the resulting operator operates on . We may think of the result as a matrix-weighted graph on vertex set , and we will sometimes denote this graph by . When is understood, we often write for the adjacency operator of , which can be thought of as an infinite matrix with rows/columns indexed by and entries from , or as its “extension” , an infinite matrix with rows/columns indexed by and scalar entries. For the polynomial , the corresponding graph is the infinite (unweighted) -regular tree.
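The correspondence between group elements and reduced words can be explored computationally. The sketch below uses a hypothetical encoding of letters as tuples `(kind, index, exponent)`; it reduces words by cancelling adjacent inverse pairs and enumerates reduced words of a given length. With d self-inverse generators, the number of reduced words of length k is d(d−1)^(k−1), which is the size of the k-th sphere in the d-regular tree.

```python
def inverse(letter):
    """Self-inverse letters ("Y") are fixed; "Z" letters flip exponent sign."""
    kind, idx, exp = letter
    return letter if kind == "Y" else (kind, idx, -exp)

def reduce_word(word):
    """Cancel adjacent inverse pairs with a stack until none remain."""
    out = []
    for letter in word:
        if out and out[-1] == inverse(letter):
            out.pop()
        else:
            out.append(letter)
    return tuple(out)

def reduced_words(generators, length):
    """All reduced words of exactly the given length over the generators."""
    words = [()]
    for _ in range(length):
        words = [w + (g,) for w in words for g in generators
                 if not w or w[-1] != inverse(g)]
    return words

# Three self-inverse generators: the Cayley graph is the 3-regular tree.
tree_gens = [("Y", i, 1) for i in range(3)]
# One self-inverse generator plus one invertible generator and its inverse.
mixed_gens = [("Y", 0, 1), ("Z", 0, 1), ("Z", 0, -1)]

sphere_sizes = [len(reduced_words(tree_gens, k)) for k in range(1, 5)]
```

In both generator sets each vertex has three neighbors, so the sphere sizes agree, consistent with both Cayley graphs being 3-regular trees.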
We may now state a theorem which is essentially the main result (“Theorem 2”) of [BC19]. The small difference is that our notion of random -lifts, which includes signs on the matchings/permutations, lets us eliminate mention of “trivial” eigenvalues (see [OW20, Thms. 1.9, 10.10]). Let be a self-adjoint matrix polynomial with coefficients from on indeterminates . Then for all and sufficiently large , the following holds:
Let , where is a random -lift, and let . Then except with probability at most , the spectra and are at Hausdorff distance at most
2.4 Random lifts as optimization problems
Given a Hermitian (i.e., self-adjoint) matrix , we are interested in the task of maximizing over all Boolean vectors . (Since is Hermitian, the quantity is always real, so this maximization problem makes sense.) This is the same as maximizing the homogeneous degree- (commutative) polynomial over , and it is also essentially the same task as the Max-Cut problem on (scalar-)weighted undirected graphs. More precisely, if is a weighted graph on vertex set with adjacency matrix , then ’s maximum cut is indicated by the that maximizes . For the sake of scaling we will also include a factor of in this optimization problem, leading to the following definition: [Optimal value] Given a Hermitian matrix , we define
(For finite-dimensional , the s and s mentioned in this section are all achieved.) We remark that
where we use the notation . Thus we also have
where is the “cut polytope”, the convex hull of all matrices of the form for . (Since is linear in , maximizing over the convex hull is the same as maximizing over the extreme points, which are just those matrices of the form .)
The above optimization problem has a natural relaxation: maximizing over all unit vectors . This leads to the following efficiently computable upper bound on : [Eigenvalue bound] Given a Hermitian matrix , we define the eigenvalue bound to be
where here denotes that is (Hermitian and) positive semidefinite. The matrices being optimized over in are known as density matrices; i.e., is the maximal inner product between and any density matrix. Note that if , then is a density matrix. Thus, is a relaxation of , or in other words, .
The set of density matrices is convex, and it’s well known that its extreme points are all the rank- density matrices; i.e., those of the form for with . Thus in it is equivalent to just maximize over these extreme points:
From this formula we see that is also equal to , the maximum eigenvalue of ; hence the terminology “eigenvalue bound”. One may also think of and as SDP duals of one another.
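The relationship between the optimal value and the eigenvalue bound can be checked by brute force on a tiny instance. The sketch below assumes the scaling factor in the optimal value is 1/n (the normalization under which a cut matrix divided by n has unit trace); the function names and the choice of the negated 5-cycle as the test matrix are illustrative assumptions.

```python
import itertools
import numpy as np

def opt_value(A):
    """Brute-force maximum of x^T A x / n over x in {-1, +1}^n (exponential time)."""
    n = A.shape[0]
    best = -np.inf
    for signs in itertools.product([-1.0, 1.0], repeat=n):
        x = np.array(signs)
        best = max(best, float(np.real(x @ A @ x)) / n)
    return best

def eig_bound(A):
    """Largest eigenvalue of a Hermitian matrix."""
    return float(np.linalg.eigvalsh(A)[-1])

# Negated adjacency matrix of the 5-cycle, so that large values mean large cuts.
n = 5
C5 = np.zeros((n, n))
for i in range(n):
    C5[i, (i + 1) % n] = C5[(i + 1) % n, i] = 1.0
A = -C5

opt, eig = opt_value(A), eig_bound(A)
print(f"optimal value {opt:.4f} <= eigenvalue bound {eig:.4f}")
```

Since the 5-cycle is not bipartite, the best cut misses one edge and the eigenvalue bound is strictly larger than the optimal value in this example.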
We now mention another well known, tighter, upper bound on . [Basic SDP bound] Given a Hermitian matrix , the basic SDP bound is defined to be
Recall that an matrix is a correlation matrix [STY73] if it is PSD and has all diagonal entries equal to . Thus is equivalently maximizing over all correlation matrices . We also note that any cut matrix is a correlation matrix, and any correlation matrix is a density matrix, and hence
[Dual SDP bound] The semidefinite dual of is the following [DP93]:
Despite the fact that the usual “Slater condition” for strong SDP duality fails in this case (because the set of correlation matrices isn’t full-dimensional), one can still show [PR95] that indeed holds for finite-dimensional .
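Weak duality for this pair can be verified numerically without an SDP solver. The sketch below assumes the dual takes the standard Delorme–Poljak form (minimize the largest eigenvalue of A + diag(u) over real vectors u summing to zero) and that the primal carries a 1/n scaling; both are assumptions about the stripped formulas, and the function names are hypothetical. Any feasible primal point then has value at most that of any feasible dual point.

```python
import numpy as np

def random_correlation_matrix(n, k, rng):
    """Gram matrix of n random unit vectors in R^k: PSD with unit diagonal."""
    V = rng.standard_normal((n, k))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    return V @ V.T

def primal_value(A, X):
    """Inner product <A, X> / n for real symmetric A and a correlation matrix X."""
    return float(np.sum(A * X)) / A.shape[0]

def dual_value(A, u):
    """Largest eigenvalue of A + diag(u); u is assumed to sum to zero."""
    return float(np.linalg.eigvalsh(A + np.diag(u))[-1])

rng = np.random.default_rng(2)
n = 8
G = rng.standard_normal((n, n))
A = (G + G.T) / 2.0

gaps = []
for _ in range(20):
    X = random_correlation_matrix(n, 3, rng)
    u = rng.standard_normal(n)
    u -= u.mean()                      # enforce the zero-sum constraint
    gaps.append(dual_value(A, u) - primal_value(A, X))
```

The gap is nonnegative because adding diag(u) leaves the inner product with X unchanged (X has unit diagonal and u sums to zero), while the inner product of any Hermitian matrix with a PSD matrix of trace n is at most n times its largest eigenvalue.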
In this work we frequently consider matrix-weighted graphs with adjacency matrices , thought of as matrices with entries from . For such matrices, whenever we write , we mean for the “extension” matrix (see Section 2.1), and similarly for , , , .
As mentioned in Section 1, the eigenvalue bound makes sense when is the adjacency matrix of an infinite graph (with bounded degree). However does not extend to the infinite case, as the number “” appearing in its definition is not finite. On the other hand, we now introduce a new, intermediate, “maximum eigenvalue-like” bound that is appropriate for matrix-weighted graphs. This is the “partitioned SDP bound” appearing in the statement of our main theorem in Section 1.2. In the following Section 3, we will show that it generalizes well to the case of infinite graphs. [Partitioned SDP bound] Let be an Hermitian matrix with entries from . We define its partitioned SDP bound to be
the matrices are also thought of as matrices with entries from ;
is interpreted as ;
denotes the sum of the diagonal entries of , which is an matrix;
is the set of correlation matrices;
in other words, the final condition is that for all .
As mentioned, the partitioned SDP bound can be viewed as “intermediate” between the eigenvalue bound and the SDP bound. To explain this, suppose is an Hermitian matrix. On one hand, we can regard as an matrix with matrix entries (); in this viewpoint, . On the other hand, we can regard as a matrix with a single matrix entry (); in this viewpoint, .
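The two extreme viewpoints can be illustrated with a feasibility check. The sketch below reconstructs the constraint from the surrounding description, under the assumption that a feasible matrix must be PSD and that each diagonal entry of the sum of its diagonal blocks equals 1/r; this normalization is inferred from the two special cases above and is not quoted from the definition. The function names are hypothetical.

```python
import numpy as np

def partitioned_diagonal(rho, n, r):
    """Diagonal of the r x r matrix obtained by summing the n diagonal blocks of rho."""
    return np.real(np.diag(rho)).reshape(n, r).sum(axis=0)

def is_feasible(rho, n, r, tol=1e-9):
    """PSD check plus the (assumed) constraint that each summed diagonal entry is 1/r."""
    hermitian = np.allclose(rho, rho.conj().T)
    psd = hermitian and np.linalg.eigvalsh(rho)[0] >= -tol
    balanced = np.allclose(partitioned_diagonal(rho, n, r), 1.0 / r)
    return bool(psd and balanced)

N = 6
x = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0])
cut = np.outer(x, x) / N                 # normalized cut matrix
e0 = np.zeros(N)
e0[0] = 1.0
point = np.outer(e0, e0)                 # rank-one density matrix on one coordinate

shapes = [(6, 1), (3, 2), (2, 3), (1, 6)]
cut_feasible = [is_feasible(cut, n, r) for n, r in shapes]
point_feasible = [is_feasible(point, n, r) for n, r in shapes]
```

Under this reading, normalized cut matrices are feasible for every choice of block size, whereas a generic density matrix is feasible only when r = 1, which is one way to see that the feasible region shrinks from the eigenvalue bound toward the basic SDP bound as r grows.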
It is easy to see that the partitioned SDP bound indeed has an SDP formulation, and we now state its SDP dual: The SDP dual of is the following:
Weak SDP duality, , holds as always, but again it is not obvious that strong SDP duality holds. In fact, not only does strong duality hold, it even holds in the case of infinite matrices . This fact is crucial for our work, and proving it is the subject of the upcoming technical section.
3 The infinite SDPs
This technical section has two goals. First, in Section 3.1 we show that strong duality holds with , even for infinite matrices with entries. Even in the finite case this is not trivial, as the feasible region for the SDP
is not full-dimensional, and hence the Slater condition ensuring strong duality does not apply. The infinite case involves some additional technical considerations, needed so that we may eventually apply the strong duality theorem for conic linear programming of Bonnans and Shapiro [BS00, Thm. 2.187]. Second, in Section 3.2, we show that in the optimization problem , values arbitrarily close to the optimum can be achieved by matrices of finite support (i.e., with only finitely many nonzero entries). Indeed (though we don’t need this fact), these finite-rank need only have rank at most . This fact is familiar from the case of , where the optimizer in the eigenvalue bound of Section 2.4 is achieved by a of rank (namely for any maximum eigenvector ). Finally, in Section 3.3 we consolidate all these results into a theorem statement suitable for use with graphs produced by infinite lifts of matrix polynomials.
3.1 SDP duality
Let be a countable set and write for the associated (complex, separable) Hilbert space of square-summable functions . We write , , for the spaces of finite-rank, trace-class, and bounded operators on , respectively. Focusing on (with the weak topology) and (with the -weak topology), these are locally convex topological vector spaces forming a dual pair with bilinear map defined by (see [RS80, Thm. VI.26], [LAN17, Thm. B.146], or [BR02, Prop. 2.4.3]). Recall [LAN17, Lem. B.142] that , where is the trace-norm.
Write (respectively, ) for the (closed) real subspace of self-adjoint operators in (respectively, ); note that and continue to form a dual pair (see, e.g., [MEY06, p. 212]). Recall that is positive semidefinite if and only if for all (for such we have ); as usual we write to mean that is positive semidefinite. Letting (respectively, ) denote the positive semidefinite operators in (respectively, ), we have that these are both (nonempty) closed, convex cones (for , see [CON90, Prop. 3.7], [LAN17, Prop. C.51]; for see [FRI19, Sec. 4], [vE20, App. A.2], [SH08, Sec. 2]); further, they are topologically dual cones [vE20, App. A.2].
We now introduce an SDP that is equivalent to our partitioned SDP; however, we will express it with scalar entries (rather than matrix entries). Fix any . Let be a partition of into nonempty parts, and for let be the operator . Consider the following semidefinite program:
To relate this back to our definition of (for infinite matrices), suppose that is a self-adjoint matrix, indexed by a countable vertex set , with entries (and only finitely many nonzero entries per row/column). Let be its “extension”, with rows/columns indexed by , and let . Then (SDP-P) is precisely how we define .
We remark that (SDP-P) is always feasible. By summing the constraints on we get ; i.e., is required to be a density operator. In particular, always, so the optimum value of (SDP-P) is finite.
We would like to show that the optimum value of (SDP-P) is equal to that of the following dual semidefinite program which, in the setup mentioned above, is equivalent to our definition of :
Showing this will take a couple of steps. The first is to mechanically write down the Lagrangian dual of (SDP-P), which is:
We first show that (D1) is equivalent to (SDP-D). We may reparameterize all by defining and writing ; in this way, every corresponds to a pair and satisfying , and vice versa. Under this reparameterization, (D1) becomes
The second constraint here is equivalent to , and hence the optimal choice of given is achieved by . Thus (D1) is equivalent to (D2), which is in turn equivalent to (SDP-D), as claimed.
We would next like to claim that strong duality holds for the dual pair (SDP-P) and (D1); however, there is a difficulty because the feasible region of (SDP-P) only has nonempty relative interior, not interior. Alternatively, one might say the difficulty is that the feasible region for (D1) is not bounded. However we can fix this by introducing the following equivalent bounded variant:
(D3) is equivalent to (D1).
Suppose we have any feasible solution for (D1). First observe that for any , if we take an arbitrary we have
Next observe that the optimum value of (D1) is at most (since we could take for all ); thus we may restrict attention to ’s that achieve objective value at most . But now we claim any such feasible will have for any . To see this, observe (using Equation 3) that
Thus if , it is strictly not optimal. We conclude that we may add the constraints for to (D1) without changing it. Clearly it doesn’t hurt to widen this interval to (for the sake of symmetry), and so we conclude that (D3) is indeed equivalent to (D1). ∎
We can now write the Lagrangian dual of (D3), which is