1 Introduction
Semidefinite programs (SDPs) constitute an important class of convex constrained optimization problems that is ubiquitous in statistics, signal processing, control systems, and other areas. In several applications, the data of the problem vary over time, so that the problem can be modeled as a time-varying semidefinite program (TVSDP). In this paper we consider TVSDPs of the form
(SDP_{t})  $\min_{X \in \mathbb{S}^n} \ \langle C(t), X \rangle$
s.t.  $\mathcal{A}(t)X = b(t), \quad X \succeq 0,$
where $t$ is a time parameter varying on a bounded interval $[0, T]$.
Here $\mathbb{S}^n$ denotes the space of real symmetric $n \times n$ matrices, $\mathcal{A}(t)\colon \mathbb{S}^n \to \mathbb{R}^m$ is a linear operator defined by $(\mathcal{A}(t)X)_i = \langle A_i(t), X \rangle$ for some $A_1(t), \dots, A_m(t) \in \mathbb{S}^n$, $b(t) \in \mathbb{R}^m$, and $C(t) \in \mathbb{S}^n$. Throughout the paper $\langle \cdot, \cdot \rangle$ denotes the Frobenius inner product and the constraint $X \succeq 0$ requires $X$ to be positive semidefinite (PSD). In this time-varying setting, one looks for a solution curve $t \mapsto X(t)$ in $\mathbb{S}^n$ such that $X(t)$ is an optimal solution for (SDP_{t}) at each time point $t$.
Time-dependent problems leading to TVSDPs occur in various applications, such as optimal power flow problems in power systems [33], state estimation problems in quantum systems [1], modeling of energy economic problems [19], job-shop scheduling problems [7], as well as problems arising in signal processing, queueing theory [36], or aircraft engineering [43]. TVSDPs can be seen as a generalization of continuous linear programming problems, which were first studied by Bellman [11] in relation to so-called bottleneck problems in multistage linear production economic processes. Since then, a large body of literature has been devoted to studying continuous linear programs with and without additional assumptions. However, the generalization of this idea to other classes of optimization problems has only recently been considered. In [46], Wang et al. study separated continuous conic programs, and finally Ahmadi and Khadir [3] consider time-varying SDPs. In contrast to our setting, they assume the data to vary polynomially with time and also restrict themselves to polynomial solutions.

A naive approach to solving the time-varying problem (SDP_{t}) is to consider, at a sequence of times $t_0 < t_1 < \cdots < t_N$, the corresponding instances of the problem (SDP_{t}) and solve them one after another. The best solvers for SDPs are interior point methods [29, 44, 6, 22, 30], which can solve them in a time that is polynomial in the input size. However, these solvers do not scale particularly well, and thus this brute-force approach may fail in applications where the volume and velocity of the data are large. Furthermore, such a straightforward method would not make use of the local information collected by solving the previous instances of the problem. Even if one considers warm starts [24, 21, 18, 42], the reduction in run time is likely to be marginal. For instance, [42, sections 5.5 and 5.6] reports a 30–60% reduction of the runtime on a collection of time-varying instances of their own choice.
Instead, in this work, we would like to utilize the idea of so-called path-following predictor–corrector algorithms as developed in [25, 5]. In classical predictor–corrector methods, a predictor step approximating the directional derivative of the solution with respect to a small change in the time parameter is applied, together with a correction step that moves from the current approximate solution closer to the next solution at the new time point. The latter is based on a Newton step for solving the first-order optimality (KKT) conditions.
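The predictor–corrector idea can be illustrated on a toy parametric root-finding problem. The following sketch is our own illustration (not the algorithm of this paper): it tracks the positive root of $F(x,t) = x^2 + tx - 1 = 0$, whose closed form is known, using a first-order predictor along $\dot x = -F_t/F_x$ followed by a single Newton corrector per time step.

```python
import numpy as np

# Toy path following: track the positive root x(t) of
# F(x, t) = x^2 + t*x - 1 = 0, with closed form x(t) = (-t + sqrt(t^2 + 4)) / 2.
F = lambda x, t: x**2 + t * x - 1.0
F_x = lambda x, t: 2 * x + t          # partial derivative in x
F_t = lambda x, t: x                  # partial derivative in t

dt, x, t = 0.1, 1.0, 0.0              # exact root at t = 0 is x = 1
for _ in range(10):
    # Predictor: first-order step along dx/dt = -F_t / F_x
    x += -F_t(x, t) / F_x(x, t) * dt
    t += dt
    # Corrector: one Newton step on F(., t) at the new time point
    x -= F(x, t) / F_x(x, t)

exact = (-t + np.sqrt(t**2 + 4)) / 2
assert abs(x - exact) < 1e-3          # the path is tracked accurately
```

One Newton correction per step suffices here because the predictor already lands within the quadratic convergence region of Newton's method; the same structure underlies the matrix-valued method developed later in the paper.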
A limiting factor in solving both stationary and time-dependent SDPs is the computational complexity when $n$ is large. A common solution to this obstacle is the Burer–Monteiro approach, as presented in the seminal work [15, 16]. In this approach, a low-rank factorization $X = YY^\top$ of the solution is assumed, with $Y \in \mathbb{R}^{n \times p}$ and $p$ potentially much smaller than $n$. In the optimization literature, the Burer–Monteiro method has been very well studied as a nonconvex optimization problem, e.g. in terms of algorithms [31], quality of the optimal value [9, 39], and (global) recovery guarantees [14, 13, 17].
In a time-varying setting, the Burer–Monteiro factorization leads to
(BM_{t})  $\min_{Y \in \mathbb{R}^{n \times p}} \ \langle C(t), YY^\top \rangle$
s.t.  $\mathcal{A}(t)(YY^\top) = b(t),$
which for every fixed $t$ is a quadratically constrained quadratic problem. A solution then is a curve $t \mapsto Y(t)$ in $\mathbb{R}^{n \times p}$, which, depending on $p$, is a space of much smaller dimension than $\mathbb{S}^n$. However, this comes at the price that the problem (BM_{t}) is now nonconvex. Moreover, theoretically it may happen that local optimization methods converge to a critical point that is not globally optimal [45], although in practice the method usually shows very good performance [15, 31, 40].
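To illustrate why the factorized formulation is attractive computationally, note that the objective and constraint values of (BM_{t}) can be evaluated directly through the factor $Y$, since $\langle C, YY^\top \rangle = \langle CY, Y \rangle$. A minimal numpy sketch with randomly generated problem data, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 50, 3, 5
C = rng.standard_normal((n, n)); C = (C + C.T) / 2          # symmetric cost
A = [(lambda M: (M + M.T) / 2)(rng.standard_normal((n, n))) for _ in range(m)]
Y = rng.standard_normal((n, p))

# Objective and constraints of (BM_t) evaluated through the factor:
# <C, Y Y^T> = <C Y, Y>, and likewise for each A_i, so the n x n
# matrix Y Y^T never needs to be formed explicitly.
obj = np.sum((C @ Y) * Y)
con = np.array([np.sum((Ai @ Y) * Y) for Ai in A])

X = Y @ Y.T                                  # full-space cross-check
assert np.isclose(obj, np.trace(C @ X))
assert np.allclose(con, [np.trace(Ai @ X) for Ai in A])
```

The factorized evaluation costs $O(n^2 p)$ per matrix and stores only $np$ variables instead of $n(n+1)/2$.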
The aim of this work is to combine the Burer–Monteiro factorization with path-following predictor–corrector methods and to develop a practical algorithm for approximating the solution of (BM_{t}), and consequently of (SDP_{t}), over time. As we explain in section 3, to apply such methods, we need to address the issue that the solutions of (BM_{t}) are never isolated, due to the nonuniqueness of the Burer–Monteiro factorization caused by orthogonal invariance. In this paper, we apply a well-known technique to handle this problem by restricting the solutions to a so-called horizontal space at every time step. From a geometric perspective, such an approach exploits the fact that equivalent factorizations can be identified as the same element in the corresponding quotient manifold with respect to the orthogonal group action [34].
The paper is structured as follows. In section 2 we review important foundations from the SDP literature and state the main assumptions we make on the TVSDP problem (SDP_{t}). Section 3 presents the underlying quotient geometry of positive semidefinite rank-$r$ matrices from a linear algebra perspective, focusing in particular on the notion of horizontal space and the injectivity radius of the map $Y \mapsto YY^\top$. We then describe in section 4 our path-following predictor–corrector algorithm, which is based on iteratively solving the linearized KKT system for (BM_{t}) over time. A main result is the rigorous error analysis for this algorithm presented in subsection 4.3. In section 5, we showcase numerical results that test our method on a time-varying variant of the well-known Goemans–Williamson SDP relaxation for the MaxCut problem in combinatorial optimization and graph theory. We conclude in section 6 with a brief discussion of our results.

2 Preliminaries and key assumptions
Naturally, the rigorous formulation of path-following algorithms requires regularity assumptions on the solution curve. In our context, this requires assumptions both on the original TVSDP problem (SDP_{t}) and on its reformulation (BM_{t}). In particular, for the latter, the correct choice of the dimension $p$ is crucial. In what follows, we present and discuss these assumptions in detail.
First, we briefly review some standard notions and properties for primal–dual SDP pairs; see [4, 47]. Consider the conic dual problem of (SDP_{t}):
(DSDP_{t})  $\max_{y \in \mathbb{R}^m,\, Z \in \mathbb{S}^n} \ b(t)^\top y$
s.t.  $\mathcal{A}^*(t)y + Z = C(t), \quad Z \succeq 0,$
where $\mathcal{A}^*(t)\colon \mathbb{R}^m \to \mathbb{S}^n$, $\mathcal{A}^*(t)y = \sum_{i=1}^m y_i A_i(t)$, is the linear operator adjoint to $\mathcal{A}(t)$. For convenience, we often drop the explicit dependence on $t$ and refer to a solution of (DSDP_{t}) simply as $(y, Z)$. While reviewing the basic properties of SDPs, we assume the time parameter $t$ to be fixed and hence omit the subindex $t$.
The KKT conditions for the pair of primal–dual convex problems (SDP_{t})–(DSDP_{t}) read
(1)  $\mathcal{A}X = b, \quad X \succeq 0, \qquad \mathcal{A}^* y + Z = C, \quad Z \succeq 0, \qquad XZ = 0.$
These are sufficient conditions for the optimality of the pair $(X, (y, Z))$.
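The KKT conditions are easy to verify numerically on a toy instance. The following sketch is our own illustrative example (a $2 \times 2$ diagonal SDP with a single trace constraint, solvable by hand): it checks primal and dual feasibility, complementarity, and the zero duality gap.

```python
import numpy as np

# min <C, X>  s.t.  <I, X> = 1,  X >= 0,  with C = diag(1, 2).
C = np.diag([1.0, 2.0])
A1 = np.eye(2)                      # single constraint matrix
b = np.array([1.0])

X = np.diag([1.0, 0.0])             # primal optimum
y = np.array([1.0])                 # dual optimum
Z = C - y[0] * A1                   # dual slack, here diag(0, 1)

# KKT conditions (1): feasibility, PSD-ness, and complementarity
assert np.isclose(np.trace(A1 @ X), b[0])                 # A X = b
assert np.all(np.linalg.eigvalsh(X) >= -1e-12)            # X >= 0
assert np.all(np.linalg.eigvalsh(Z) >= -1e-12)            # Z >= 0
assert np.allclose(X @ Z, 0)                              # X Z = 0
assert np.isclose(np.trace(C @ X), b @ y)                 # zero duality gap
```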
Definition 2.1 (Strict feasibility).
We say that strict feasibility holds for an instance of the primal SDP if there exists a positive definite matrix $X \succ 0$ that satisfies $\mathcal{A}X = b$. Similarly, strict feasibility holds for the dual if there exists a vector $y$ satisfying $C - \mathcal{A}^* y \succ 0$.

It is well-known that under strict feasibility the KKT conditions are also necessary for optimality. Note that, in general, a pair of optimal solutions $(X, Z)$ satisfies the inclusions $\operatorname{ran}(Z) \subseteq \ker(X)$ and $\operatorname{ran}(X) \subseteq \ker(Z)$, where $\operatorname{ran}$ and $\ker$ denote the image and kernel, respectively.
Definition 2.2 (Strict complementarity).
A primal–dual optimal point $(X, y, Z)$ is said to be strictly complementary if $\operatorname{rank}(X) + \operatorname{rank}(Z) = n$ (or, equivalently, $X + Z \succ 0$). A primal–dual pair of an instance of SDP satisfies strict complementarity if there exists a strictly complementary primal–dual optimal point $(X, y, Z)$.
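A numerical illustration (our own construction): building $X$ and $Z$ with complementary column spans produces a strictly complementary pair, which can be checked via ranks or via positive definiteness of $X + Z$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 2
# X = U D U^T has rank r and Z = V E V^T has rank n - r,
# where [U V] is an orthogonal matrix, so ran(X) and ran(Z) are complementary.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
U, V = Q[:, :r], Q[:, r:]
X = U @ np.diag(rng.uniform(1, 2, r)) @ U.T
Z = V @ np.diag(rng.uniform(1, 2, n - r)) @ V.T

assert np.allclose(X @ Z, 0)                               # complementarity
# strict complementarity: rank X + rank Z = n, i.e. X + Z is positive definite
assert np.linalg.matrix_rank(X) + np.linalg.matrix_rank(Z) == n
assert np.all(np.linalg.eigvalsh(X + Z) > 0)
```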
Definition 2.3 (Nondegeneracy).
A primal feasible point $X$ is primal nondegenerate if
(2)  $T_X + \ker \mathcal{A} = \mathbb{S}^n,$
with $T_X$ being the tangent space to the manifold of fixed rank symmetric matrices at $X$, where $r = \operatorname{rank}(X)$. Let $X = U \Lambda U^\top$ with $U \in \mathbb{R}^{n \times r}$ be a rank-revealing decomposition; then
$T_X = \{ UW^\top + WU^\top : W \in \mathbb{R}^{n \times r} \}.$
A dual feasible point $(y, Z)$ is dual nondegenerate if
$T_Z + \operatorname{ran} \mathcal{A}^* = \mathbb{S}^n,$
where $T_Z$ is the tangent space at $Z$ to the manifold of fixed rank symmetric matrices with $s = \operatorname{rank}(Z)$.
Primal–dual strict feasibility implies the existence of both a primal and a dual optimal solution with a zero duality gap. In addition, primal (dual) nondegeneracy implies uniqueness of the dual (primal) solution. Under strict complementarity, the converse is also true, that is, uniqueness of the primal (dual) optimal solution implies dual (primal) nondegeneracy of these solutions. Moreover, primal–dual nondegeneracy and strict complementarity hold generically. We refer to [4] for details.
With regard to the time-varying case, these facts can be generalized as follows.
Theorem 2.4.
(Bellon et al., [12, Theorem 2.19]) Let (P_{t},D_{t}) be a primal–dual pair of TVSDPs parametrized over a time interval $[0, T]$ such that primal–dual strict feasibility holds for any $t \in [0, T]$, and assume that the data are continuously differentiable functions of $t$. Let $t_*$ be a fixed value of the time parameter and suppose that $(X_*, y_*, Z_*)$ is a nondegenerate optimal and strictly complementary point for (P_{t∗},D_{t∗}). Then there exist $\varepsilon > 0$ and a unique continuously differentiable mapping $t \mapsto (X(t), y(t), Z(t))$ defined on $(t_* - \varepsilon, t_* + \varepsilon) \cap [0, T]$ such that $(X(t), y(t), Z(t))$ is a unique and strictly complementary primal–dual optimal point to (P_{t},D_{t}) for all such $t$. In particular, the ranks of $X(t)$ and $Z(t)$ are constant for all such $t$.
Based on these facts, for the initial problem (SDP_{t}) we make the following assumptions.

(A1) The linear operator $\mathcal{A}(t)$ is surjective for any $t \in [0, T]$.

(A2) The solution pair is strictly complementary and nondegenerate for any $t \in [0, T]$.

(A3) The data $C(t)$, $\mathcal{A}(t)$, $b(t)$ are continuously differentiable functions of $t$.

Assumption (A1) is standard for SDPs and in linearly constrained optimization in general, while assumptions (A2)–(A3) rule out many "pathological" cases [12]. In particular, assumption (A2) implies that the solution pair is unique. By Theorem 2.4, assumptions (A1)–(A3) have the following consequences:

(C1) (SDP_{t}) has a unique and smooth solution curve $t \mapsto (X(t), y(t), Z(t))$, $t \in [0, T]$.

(C2) The curve $X(t)$ is of constant rank $r$.
For setting up the factorized version (BM_{t}) of (SDP_{t}), it is necessary to choose the dimension $p$ of the factor matrix $Y$ in (BM_{t}), ideally equal to the rank $r$ of (C2). In what follows, we assume that we know the constant rank $r$ and set $p = r$. Given access to an initial solution $X(0)$ at time $t = 0$, it is possible to compute $r = \operatorname{rank}(X(0))$, so this assumption is without further loss of generality.
It is worth noting that the rank $r$ cannot be arbitrary. By a known result of Barvinok and Pataki [10], for any SDP defined by $m$ linearly independent constraints there always exists a solution of rank $r$ such that $r(r+1)/2 \le m$. Since we assume that $X(t)$ is the unique solution to (SDP_{t}) with constant rank $r$, we conclude that
$$\frac{r(r+1)}{2} \le m.$$
We point out that recently the Barvinok–Pataki bound has been slightly improved [28].
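The largest rank admissible under the Barvinok–Pataki bound is straightforward to compute. A small sketch (the helper name is our own) returns the largest $r$ with $r(r+1)/2 \le m$:

```python
import math

def barvinok_pataki_rank(m: int) -> int:
    """Largest r with r(r+1)/2 <= m: for an SDP with m linearly independent
    constraints, some optimal solution has rank at most this value."""
    return (math.isqrt(8 * m + 1) - 1) // 2

assert barvinok_pataki_rank(3) == 2     # 2*3/2 = 3 <= 3
assert barvinok_pataki_rank(10) == 4    # 4*5/2 = 10 <= 10, but 5*6/2 = 15 > 10
```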
3 Quotient geometry of PSD rank-$r$ matrices
We now investigate the factorized formulation (BM_{t}) in more detail. As already mentioned, in contrast to the original problem (SDP_{t}), this is a nonlinear problem (specifically, a quadratically constrained quadratic problem) which is nonconvex. Moreover, the property of uniqueness of a solution, which is guaranteed by (C1) for the original problem (SDP_{t}), is lost in (BM_{t}), because the representation of $X = YY^\top$ via the map
$$\phi \colon \mathbb{R}^{n \times r} \to \mathbb{S}^n, \quad \phi(Y) = YY^\top,$$
is not unique. In fact, this map is invariant under the orthogonal group action
$$(Y, Q) \mapsto YQ$$
on $\mathbb{R}^{n \times r}$, where
$$O(r) = \{ Q \in \mathbb{R}^{r \times r} : Q^\top Q = I_r \},$$
with $I_r$ denoting the identity matrix, is the orthogonal group. Hence both the objective function and the constraints in (BM_{t}) are invariant under the same action. As a consequence, the solutions of (BM_{t}) are never isolated [31]. This poses a technical obstacle to the use of path-following algorithms, as the path needs to be, at least locally, uniquely defined.
On the other hand, by assuming that the correct rank $r$ of a unique solution for (SDP_{t}) has been chosen for the factorization, any solution $Y$ of (BM_{t}) must satisfy $YY^\top = X(t)$ (and in particular be of full column rank). From this it easily follows that the above action of the orthogonal group is indeed the only source of nonuniqueness. This corresponds to the well-known fact that the set of positive semidefinite symmetric matrices of fixed rank $r$, which we denote by $\mathcal{S}_{+,r}$, is a smooth manifold that can be identified with the quotient manifold $\mathbb{R}^{n \times r}_* / O(r)$, where $\mathbb{R}^{n \times r}_*$ is the open set of matrices with full column rank.
In the following, we describe how the nonuniqueness can be removed by introducing the so-called horizontal space, which is a standard concept in optimization on quotient manifolds [2]. For positive semidefinite fixed-rank matrices, this has been worked out in detail in [34]. Additional material, including the complex Hermitian case, can be found in [8]. However, in order to arrive at practical formulas that are useful for our path-following algorithm later on, we will not further refer to the concept of a quotient manifold but directly focus on the injectivity of the map $\phi(Y) = YY^\top$ on a suitable linear subspace, which we describe in the following section.
3.1 Horizontal space and unique factorizations
Given $Y \in \mathbb{R}^{n \times r}_*$, we denote the corresponding orbit under the orthogonal group as
$$\mathcal{O}_Y = \{ YQ : Q \in O(r) \}.$$
The orbit is an embedded submanifold of $\mathbb{R}^{n \times r}$ of dimension $r(r-1)/2$ with two connected components, according to $\det Q = \pm 1$. Its tangent space at $Y$, which we denote by $V_Y$, is easily derived by noting that the tangent space to the orthogonal group $O(r)$ at the identity matrix equals the space of real skew-symmetric matrices $\{ \Omega \in \mathbb{R}^{r \times r} : \Omega^\top = -\Omega \}$ (see, e.g., [2, Example 3.5.3]). Therefore,
$$V_Y = \{ Y\Omega : \Omega^\top = -\Omega \}.$$
Since the map $\phi(Y) = YY^\top$ is constant on $\mathcal{O}_Y$, its derivative
$$\phi'(Y)[\Delta] = Y\Delta^\top + \Delta Y^\top$$
vanishes on $V_Y$, that is, $V_Y \subseteq \ker \phi'(Y)$.

The horizontal space at $Y$, denoted by $H_Y$, is the orthogonal complement of $V_Y$ in $\mathbb{R}^{n \times r}$ with respect to the Frobenius inner product. One verifies that
$$H_Y = \{ \Delta \in \mathbb{R}^{n \times r} : Y^\top \Delta = \Delta^\top Y \},$$
since $\langle Y\Omega, \Delta \rangle = 0$ holds for all skew-symmetric $\Omega$ if and only if $Y^\top \Delta$ is symmetric. We point out that sometimes any subspace complementary to $V_Y$ is called a horizontal space, but we will stick to the above choice, as it is the most common and has certain theoretical and practical advantages. In particular, since $Y \in H_Y$, we have that $Y + H_Y = H_Y$ is just a linear space.
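The orthogonal decomposition into vertical and horizontal parts can be computed explicitly: writing the vertical component of a direction $Z$ as $Y\Omega$ with $\Omega$ skew-symmetric and requiring the remainder to be horizontal leads to the Sylvester-type equation $(Y^\top Y)\Omega + \Omega(Y^\top Y) = Y^\top Z - Z^\top Y$. The following numpy sketch (our own illustration) solves it via an eigendecomposition of $Y^\top Y$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 8, 3
Y = rng.standard_normal((n, r))      # full column rank almost surely
Z = rng.standard_normal((n, r))      # arbitrary direction to be decomposed

# Solve (Y^T Y) Omega + Omega (Y^T Y) = Y^T Z - Z^T Y for skew-symmetric Omega.
# In the eigenbasis of Y^T Y the equation decouples entrywise.
d, V = np.linalg.eigh(Y.T @ Y)
Q = V.T @ (Y.T @ Z - Z.T @ Y) @ V
Omega = V @ (Q / (d[:, None] + d[None, :])) @ V.T

Delta = Z - Y @ Omega                # horizontal component of Z
assert np.allclose(Y.T @ Delta, Delta.T @ Y)   # Delta lies in H_Y
assert np.allclose(Omega, -Omega.T)            # Omega is skew-symmetric
```

Since the right-hand side $Y^\top Z - Z^\top Y$ is skew-symmetric, the computed $\Omega$ is automatically skew-symmetric, so $Y\Omega$ is indeed the vertical part of $Z$.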
The purpose of the horizontal space is to provide a unique way of representing a neighborhood of $X = YY^\top$ in $\mathcal{S}_{+,r}$ through matrices $\hat{Y} = Y + \Delta$ with $\Delta \in H_Y$. Clearly,
$$\dim H_Y = nr - \frac{r(r-1)}{2} = \dim \mathcal{S}_{+,r}.$$
Moreover, the following holds.
Proposition 3.1.
The restriction of $\phi'(Y)$ to $H_Y$ is injective. In particular, it holds that
$$\|Y\Delta^\top + \Delta Y^\top\|_F \ge \sqrt{2}\,\sigma_r(Y)\,\|\Delta\|_F \quad \text{for all } \Delta \in H_Y,$$
where
$$\sigma_r(Y)$$
is the smallest singular value of $Y$. This bound is sharp if $n > r$. For $n = r$ one has the sharp estimate
$$\|Y\Delta^\top + \Delta Y^\top\|_F \ge 2\,\sigma_r(Y)\,\|\Delta\|_F.$$
As a consequence, in either case, $H_Y \cap \ker \phi'(Y) = \{0\}$.
Proof.
For $\Delta \in H_Y$ we have $\langle Y\Delta^\top, \Delta Y^\top \rangle = \operatorname{tr}(\Delta Y^\top \Delta Y^\top) = \|Y^\top \Delta\|_F^2$ by standard properties of the trace. Taking squared norms yields
$$\|Y\Delta^\top + \Delta Y^\top\|_F^2 = \|Y\Delta^\top\|_F^2 + \|\Delta Y^\top\|_F^2 + 2\|Y^\top \Delta\|_F^2 = 2\|\Delta Y^\top\|_F^2 + 2\|Y^\top \Delta\|_F^2.$$
To derive the second equality we used $Y^\top \Delta = \Delta^\top Y$ for $\Delta \in H_Y$. Clearly, $\|\Delta Y^\top\|_F^2 \ge \sigma_r(Y)^2 \|\Delta\|_F^2$, and if $n = r$ we also have that $\|Y^\top \Delta\|_F^2 \ge \sigma_r(Y)^2 \|\Delta\|_F^2$. This proves the asserted lower bounds. To show that they are sharp, let $(u, v)$ be a (normalized) singular vector pair such that $Yv = \sigma_r(Y)\,u$. If $n > r$, then for any unit vector $w$ such that $Y^\top w = 0$, one verifies that the matrix $\Delta = wv^\top$ is in $H_Y$ and achieves equality. When $n = r$, $\Delta = uv^\top$ achieves it. ∎
Since $\phi'(Y)$ maps $H_Y$ to the tangent space of $\mathcal{S}_{+,r}$ at $X = YY^\top$, which is of the same dimension as $H_Y$, the above proposition implies that $\phi'(Y)$ is a bijection between $H_Y$ and this tangent space. This already shows that the restriction of $\phi$ to the linear space $H_Y$ is a local diffeomorphism between a neighborhood of $Y$ in $H_Y$ and a neighborhood of $X$ in $\mathcal{S}_{+,r}$. The subsequent more quantitative statement matches Theorem 6.3 in [34] on the injectivity radius of the quotient manifold $\mathbb{R}^{n \times r}_* / O(r)$. For convenience we will provide a self-contained proof that is more algebraic and does not require the concept of quotient manifolds.
Proposition 3.2.
Let $\mathcal{B} = \{ \Delta \in H_Y : \|\Delta\|_F < \sigma_r(Y) \}$. Then the restriction of $\phi$ to $Y + \mathcal{B}$ is injective and maps $Y + \mathcal{B}$ diffeomorphically to a (relatively) open neighborhood of $X = YY^\top$ in $\mathcal{S}_{+,r}$.

It is interesting to note that $\mathcal{B}$ is the largest possible ball in $H_Y$ on which the result can hold: for a singular pair $(u, v)$ of $Y$ with $Yv = \sigma_r(Y)\,u$, the rank-one matrix $-\sigma_r(Y)\,uv^\top$ belongs to $H_Y$ and $Y - \sigma_r(Y)\,uv^\top$ is rank-deficient. Another important observation is that $\sigma_r(Y)$ does not depend on the particular choice of $Y$ within the orbit $\mathcal{O}_Y$.
Proof.
Consider $Y + \Delta_1, Y + \Delta_2 \in Y + \mathcal{B}$ with $\phi(Y + \Delta_1) = \phi(Y + \Delta_2)$. Let
$$Y = U \Sigma V^\top$$
be a singular value decomposition of $Y$ with $U \in \mathbb{R}^{n \times r}$ and $V \in \mathbb{R}^{r \times r}$ having orthonormal columns. We assume $n > r$. Then by $U_\perp \in \mathbb{R}^{n \times (n-r)}$ we denote a matrix with orthonormal columns and $U^\top U_\perp = 0$. In the case $n = r$, the terms involving $U_\perp$ in the following calculation are simply not present. We write
$$\Delta_i = U A_i V^\top + U_\perp B_i V^\top, \qquad i = 1, 2.$$
Since $\Delta_i \in H_Y$, we have that $\Sigma A_i$ is symmetric. Then a direct calculation yields
$$\phi(Y + \Delta_1) = U(\Sigma + A_1)(\Sigma + A_1)^\top U^\top + U(\Sigma + A_1)B_1^\top U_\perp^\top + U_\perp B_1(\Sigma + A_1)^\top U^\top + U_\perp B_1 B_1^\top U_\perp^\top,$$
and analogously for $\phi(Y + \Delta_2)$. Since the four terms in the above sum are mutually orthogonal in the Frobenius inner product, the equality particularly implies
$$(\Sigma + A_1)(\Sigma + A_1)^\top = (\Sigma + A_2)(\Sigma + A_2)^\top$$
as well as
(3)  $(\Sigma + A_1)B_1^\top = (\Sigma + A_2)B_2^\top.$
The first of these equations can be written as
$$\Sigma (A_1 - A_2)^\top + (A_1 - A_2) \Sigma = A_2 A_2^\top - A_1 A_1^\top.$$
By Proposition 3.1 (with $n = r$, $Y = \Sigma$ and $\Delta = A_1 - A_2$),
$$\|\Sigma (A_1 - A_2)^\top + (A_1 - A_2) \Sigma\|_F \ge 2\,\sigma_r(Y)\,\|A_1 - A_2\|_F,$$
whereas
$$\|A_2 A_2^\top - A_1 A_1^\top\|_F \le (\|A_1\|_F + \|A_2\|_F)\,\|A_1 - A_2\|_F.$$
Since $\|A_i\|_F \le \|\Delta_i\|_F < \sigma_r(Y)$, this shows that we must have $A_1 = A_2$, which then by (3) also implies $B_1 = B_2$, since $\Sigma + A_1$ is invertible.

Hence, we have proven that $\phi$ is an injective map from $Y + \mathcal{B}$ to $\mathcal{S}_{+,r}$. To validate that it is a diffeomorphism onto its image we show that it is locally a diffeomorphism, for which again it suffices to confirm that $\phi'(\hat{Y})$ is injective on $H_Y$ for every $\hat{Y} = Y + \Delta \in Y + \mathcal{B}$ (since $H_Y$ and $\mathcal{S}_{+,r}$ have the same dimension). It follows from Proposition 3.1 (with $Y$ replaced by $\hat{Y}$, which has full column rank) that the null space of $\phi'(\hat{Y})$ equals $V_{\hat{Y}}$. We claim that $V_{\hat{Y}} \cap H_Y = \{0\}$, which proves the injectivity of $\phi'(\hat{Y})$ on $H_Y$. Indeed, let $\Theta$ be an element in the intersection, i.e., $\Theta = \hat{Y}\Omega$ for some skew-symmetric $\Omega$ and $Y^\top \Theta = \Theta^\top Y$. Inserting the first relation into the second, and using $S := Y^\top \hat{Y} = Y^\top Y + Y^\top \Delta$, yields the homogeneous Lyapunov equation
(4)  $S\Omega + \Omega S = 0.$
The symmetric matrix
$$S = Y^\top Y + Y^\top \Delta$$
in (4) is positive definite, since for $x \neq 0$ we have $x^\top S x \ge \|Yx\|\,(\|Yx\| - \|\Delta x\|) \ge \sigma_r(Y)\,(\sigma_r(Y) - \|\Delta\|_2)\,\|x\|^2 > 0$. But in this case (4) implies $\Omega = 0$, that is, $\Theta = 0$. ∎

Finally, it is also possible to provide a lower bound on the radius of the largest ball around $X = YY^\top$ such that its intersection with $\mathcal{S}_{+,r}$ is contained in the image $\phi(Y + \mathcal{B})$ (so that an inverse map is defined).
Proposition 3.3.
Any $\hat{X} \in \mathcal{S}_{+,r}$ satisfying the closeness condition below is in the image $\phi(Y + \mathcal{B})$, that is, there exists a unique $\Delta \in \mathcal{B}$ such that $\hat{X} = \phi(Y + \Delta)$.

Observe that one could take
(5)
as a slightly cleaner sufficient condition in the proposition.
Proof.
Let with and assume a polar decomposition of , where , is positive semidefinite and is orthogonal. Let . Then
(6) 
satisfies , and since is symmetric, we have . We need to show , that is, . Proposition 3.2 then implies that is unique in . Let be the orthogonal projector onto the column span of and . With that, we have the decomposition
(7) 
We estimate both terms separately. Since is symmetric and positive semidefinite, the first term satisfies
(8) 
A simple consideration using a singular value decomposition of and reveals that
for some and with orthonormal columns. Consequently, by von Neumann’s trace inequality (see, e.g., [27, Theorem 7.4.1.1]), we have
Inserting this in (8) yields
We remark that we could have concluded this inequality from [8, Theorem 2.7], where it is also stated. It actually holds for any factor for which the cross product is symmetric and positive semidefinite, using the same argument (in particular for the initial factor). Let now a singular value decomposition be given, with the smallest positive singular value. Then, for some positive semidefinite matrix, it follows from well-known results, cf. [41], that the claimed identity holds. (For completeness we provide the proof: the matrix is the unique solution to the corresponding matrix equation, since the associated linear operator is symmetric in the Frobenius inner product and has positive eigenvalues; the eigenvectors are rank-one matrices built from the eigenvectors of the factor.) Noting these facts, we conclude the first part with
(9) 
The second term in (7) can be estimated as follows:
(10) 
where we used the Cauchy–Schwarz inequality and the fact that the difference has rank at most $2r$.
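A representative $\hat{Y}$ with $Y^\top \hat{Y}$ symmetric positive semidefinite, as used in the proof above, can be computed in practice by an orthogonal alignment obtained from an SVD (a Procrustes-type rotation). A numpy sketch, purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 8, 3
Y = rng.standard_normal((n, r))
Q0, _ = np.linalg.qr(rng.standard_normal((r, r)))
Yhat = (Y + 0.01 * rng.standard_normal((n, r))) @ Q0   # rotated nearby factor

# Align Yhat to Y: with M = Yhat^T Y and SVD M = U S V^T, the rotation
# Q = U V^T makes Y^T (Yhat Q) = V S V^T symmetric positive semidefinite,
# i.e. Yhat Q - Y lies in the horizontal space H_Y.
U, _, Vt = np.linalg.svd(Yhat.T @ Y)
Yrep = Yhat @ (U @ Vt)

S = Y.T @ Yrep
assert np.allclose(S, S.T)                          # symmetric ...
assert np.all(np.linalg.eigvalsh(S) >= -1e-8)       # ... and PSD
assert np.allclose(Yrep @ Yrep.T, Yhat @ Yhat.T)    # same matrix X-hat
```

This is exactly the orthogonal Procrustes solution; it selects, among all factors of $\hat{X} = \hat{Y}\hat{Y}^\top$, the one closest to $Y$.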
3.2 A time interval for the factorized problem
We now return to the factorized problem formulation (BM_{t}). Let $Y_0$ be an optimal solution of (BM_{t}) at some fixed time point $t_0$ (so that $X(t_0) = Y_0 Y_0^\top$ and $Y_0$ has full column rank $r$). Based on the above propositions, we are able to state a result on the allowed time interval for which the factorized problem (BM_{t}) is guaranteed to admit unique solutions on the horizontal space corresponding to the original problem (SDP_{t}). For this, exploiting the smoothness of the curve $t \mapsto X(t)$, we first assume that a uniform bound
(12)  $\mu \ge \|\dot{X}(t)\|_F$
on the time derivative, as well as a bound
(13)  $\lambda \le \lambda_r(X(t))$
on the smallest nonzero eigenvalue of $X(t)$, are available for $t \in [0, T]$. Notice that the existence of such bounds is without any further loss of generality: the existence of $\mu$ follows from (C1), which guarantees that $t \mapsto X(t)$ is a smooth curve, while the existence of $\lambda$ is guaranteed by (C2), since $X(t)$ has constant rank.
Theorem 3.5.
Proof.
It suffices to show that for $t$ in the asserted time interval the solutions $X(t)$ of (SDP_{t}) lie in the image $\phi(Y_0 + \mathcal{B})$. By Proposition 3.3, this is the case if $\|X(t) - X(t_0)\|_F$ is small enough relative to $\lambda_r(X(t_0))$, the smallest nonzero eigenvalue of $X(t_0)$. Since $\|X(t) - X(t_0)\|_F \le \mu\,|t - t_0|$ and $\lambda \le \lambda_r(X(t_0))$, the condition is sufficient. Then Proposition 3.2 provides the smooth solution curve for problem (BM_{t}). ∎
The results of this section motivate the definition of a version of (BM_{t}) restricted to the horizontal space, which we provide in the next section.
4 Path following the trajectory of solutions
In this section, we present a path-following procedure for computing a sequence of approximate solutions at different time points that tracks a trajectory of solutions to the Burer–Monteiro reformulation (BM_{t}). From this sequence we are then able to reconstruct a corresponding sequence of approximate solutions tracking the trajectory of solutions for the full-space TVSDP problem (SDP_{t}). The path-following method is based on iteratively solving the linearized KKT system. Given an iterate on the path, we explained in the previous section how to eliminate the problem of nonuniqueness of the path in a small time interval by considering problem (BM_{t}) restricted to the horizontal space. We now need to ensure that this also addresses the question of the invertibility of the linearized KKT system. We show in Theorem 4.2 that this is indeed guaranteed under standard regularity assumptions on the original problem (SDP_{t}). This is a remarkable fact of somewhat independent interest.
4.1 Linearized KKT conditions and second-order sufficiency
Given an optimal solution $Y(t)$ at time $t$, we aim to find a solution at time $t + \Delta t$. By the results of the previous section, the next solution can be expressed in a unique way as
$$Y(t + \Delta t) = Y(t) + \Delta Y,$$
where $\Delta Y$ is in the horizontal space $H_{Y(t)}$, provided that $\Delta t$ is small enough.
We define the following maps:
(14)  
By definition, if and only if . For symmetry reasons we use the equivalent condition (which reflects the fact that $Y + H_Y = H_Y$ is actually a linear space).
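Assuming the standard Lagrangian $L(Y, y) = \langle C(t), YY^\top \rangle - y^\top(\mathcal{A}(t)(YY^\top) - b(t))$, the first-order residuals for (BM_{t}) can be sketched as follows. This is our own illustration with random data and a hypothetical helper name `kkt_residual`; the stationarity condition is $2(C - \mathcal{A}^* y)Y = 0$ and the feasibility condition is $\mathcal{A}(YY^\top) = b$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, m = 10, 2, 4
C = rng.standard_normal((n, n)); C = (C + C.T) / 2
A = [(lambda M: (M + M.T) / 2)(rng.standard_normal((n, n))) for _ in range(m)]
Y = rng.standard_normal((n, r))
b = np.array([np.sum((Ai @ Y) * Y) for Ai in A])   # constructed so Y is feasible
y = rng.standard_normal(m)

def kkt_residual(Y, y):
    """Residuals of the first-order conditions for the factorized problem:
    stationarity 2 (C - sum_i y_i A_i) Y and feasibility A(Y Y^T) - b."""
    Z = C - sum(yi * Ai for yi, Ai in zip(y, A))   # dual slack matrix
    stat = 2 * Z @ Y
    feas = np.array([np.sum((Ai @ Y) * Y) for Ai in A]) - b
    return stat, feas

stat, feas = kkt_residual(Y, y)
assert np.allclose(feas, 0)          # Y was built to be feasible
assert stat.shape == (n, r)          # stationarity residual lives in R^{n x r}
```

Linearizing these residuals in $(\Delta Y, \Delta y)$, with $\Delta Y$ restricted to the horizontal space, gives the system solved in each corrector step.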