
Time-Varying Semidefinite Programs

08/12/2018
by Amir Ali Ahmadi, et al.

We study time-varying semidefinite programs (TV-SDPs), which are semidefinite programs whose data (and solutions) are functions of time. Our focus is on the setting where the data varies polynomially with time. We show that under a strict feasibility assumption, restricting the solutions to also be polynomial functions of time does not change the optimal value of the TV-SDP. Moreover, by using a Positivstellensatz on univariate polynomial matrices, we show that the best polynomial solution of a given degree to a TV-SDP can be found by solving a semidefinite program of tractable size. We also provide a sequence of dual problems which can be cast as SDPs and that give upper bounds on the optimal value of a TV-SDP (in maximization form). We prove that under a boundedness assumption, this sequence of upper bounds converges to the optimal value of the TV-SDP. Under the same assumption, we also show that the optimal value of the TV-SDP is attained. We demonstrate the efficacy of our algorithms on a maximum-flow problem with time-varying edge capacities, a wireless coverage problem with time-varying coverage requirements, and on bi-objective semidefinite optimization where the goal is to approximate the Pareto curve in one shot.


1 Introduction

We study semidefinite programs (SDPs) whose feasible set and objective function depend on time. More specifically, a time-varying semidefinite program (TV-SDP) is an optimization problem of the form

(1)

Here, the operator is defined as

(2)

where

and

The data to the problem consists of , for , and for , which satisfies the requirement that be a measurable function in for all and that , where is any matrix norm. For a symmetric matrix , we write to denote that is positive semidefinite, i.e., has nonnegative eigenvalues. The abbreviation a.e. indicates that the matrix inequality in (1) should hold “almost everywhere”; i.e., for every , where is some set of measure zero with respect to the Lebesgue measure.

For an interval , we define the set

With this notation, a feasible solution to the TV-SDP in (1) is a function that satisfies the constraint

(3)

and the feasible set of the TV-SDP is the set . The choice of the interval is of course made for convenience. Without loss of generality, we can reduce any bounded interval , with , to the interval by performing the change of variable .
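For concreteness (the displayed formulas in this section were lost in extraction, so the symbols below are our reconstruction): assuming the reference interval is $[0,1]$, the change of variable for a bounded interval $[a,b]$ with $a<b$ can be taken to be
\[
s \;=\; \frac{t-a}{b-a}\,\in\,[0,1], \qquad \tilde{x}(s)\;:=\;x\big(a+s\,(b-a)\big),
\]
so that $x$ is feasible on $[a,b]$ exactly when $\tilde{x}$ is feasible for the correspondingly rescaled data on $[0,1]$; if the objective is an integral over time, it is simply rescaled by the constant factor $b-a$.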

We equip and respectively with the inner products and defined as

and

where stands for the trace of a matrix . Using the notation for the first inner product above, the TV-SDP in (1) can be written more compactly as

The terms in (2) are called kernel terms and broaden the class of problems that can be modelled as a TV-SDP. The special case where the terms are identically zero is already interesting and presents an infinite sequence of SDPs indexed by time . While these SDPs are in principle independent of each other, basic strategies such as sampling and solving a finite number of independent SDPs generally fail to provide a solution to the TV-SDP. This is because candidate functions obtained from simple interpolation schemes can violate feasibility in between sample points.

When the terms are not zero, the value that a solution takes at a given time affects the range of values that it can take at other times. When the terms are constant functions of and for instance, the TV-SDP in (1) is already powerful enough to express linear constraints involving the function and its derivatives and/or integrals of any order. For example, to impose a constraint on , one can introduce a new decision variable which is related to via the linear constraint .
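As a worked illustration of this last point (the inline formulas above were lost, so the symbols here are our own): to constrain the derivative of a component $x_1$, one can introduce a second decision variable $x_2$, intended to play the role of $x_1'$, and couple the two through the kernel-type linear constraint
\[
x_1(t) \;-\; \int_0^t x_2(s)\,ds \;=\; \alpha \qquad \text{for a.e. } t,
\]
where $\alpha$ is a given initial value (or an additional time-constant decision variable). This forces $x_2 = x_1'$ almost everywhere, so any pointwise constraint placed on $x_2$ constrains the derivative of $x_1$. Iterating the construction handles higher-order derivatives, and the same idea with the roles reversed handles constraints on integrals of $x_1$.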

In this paper, we consider the data of the TV-SDP in (1) to belong to the class of polynomial functions. Our interest in this setting stems from two reasons. On the one hand, the set of polynomial functions is dense in the set of continuous functions on and hence powerful enough for modeling purposes. On the other hand, polynomials can be finitely parameterized (in the monomial basis for instance) and are very suitable for algorithmic operations.
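To make the finite parameterization explicit (notation ours): a polynomial candidate of degree $d$ can be written in the monomial basis as
\[
x(t) \;=\; \sum_{k=0}^{d} c_k\, t^k, \qquad c_0,\dots,c_d \in \mathbb{R}^n,
\]
so that searching over such candidates amounts to optimizing over the $(d+1)\,n$ real coefficients $c_k$. This is the finite-dimensional search space underlying the algorithmic results of Section 3.2.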

Even when the input data to a TV-SDP is polynomial, there is no reason to expect its optimal solution to be a polynomial or even a continuous function. Nevertheless, we concern ourselves in this paper with finding feasible polynomial solutions to a TV-SDP (which naturally provide lower bounds on its optimal value). Our motivation for making this choice is twofold. First, solutions that are smooth are often preferred in practice. Consider for example the problem of scheduling generation of electric power when daily user consumption varies with time, or that of finding a time-varying controller for a robotic arm that serves some routine task in a production line. In such scenarios, smoothness of the solution is important for avoiding deterioration of the hardware or guaranteeing safety of the workplace. Continuity of the solution is even more essential as physical implementation of a discontinuous solution is not viable. Our second motivation for studying polynomial solutions is algorithmic. We will show (cf. Section 3.2) that optimal polynomial solutions of a given degree to a TV-SDP with polynomial data can be found by solving a (non time-varying) SDP of tractable size. These observations call for a better understanding of the power of polynomial solutions as their degree increases, or a methodology that can bound their gap to optimality when their degree is capped. These considerations are the subjects of Section 3.1 and Section 4 respectively.

As an illustration of a TV-SDP with polynomially time-varying data and a preview of our solution technique, consider problem (1) with and data

Figure 1: An example of a TV-SDP

As the kernel terms are identically zero here, an optimal solution to this TV-SDP is a function such that for all in (except possibly on a set of measure zero), is a maximizer of under the constraints . In Figure 1, the dotted red line represents the optimal polynomial solution of degree . The feasible set for some sample times is delimited by blue lines. The objective function is represented by a black arrow, which also moves in time. The feasible solution achieves an objective value of . By solving an inexpensive dual problem (with in problem (13) of Section 4), we can conclude that the optimal value of the TV-SDP cannot be greater than . Moreover, we can get arbitrarily close to the exact optimal value of the TV-SDP by increasing the degree of the candidate polynomial solutions (cf. Section 3.1) or the level in the hierarchy of our dual problems (cf. Section 4.1).

1.1 Related literature

Time-varying SDPs contain as a special case the time-varying versions of the most common classes of convex optimization problems, including linear programs, convex quadratic programs, and second-order cone programs. In the linear programming case, this problem has been studied in the literature under the name of continuous linear programs (CLPs). In its most general form, a CLP is a problem of the type

(4)

where , and , for all .
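Since the display (4) did not survive extraction, we recall one common way of writing a continuous linear program for the reader's orientation; the symbols below are our own and need not match the authors' exact choices:
\[
\sup_{x}\;\int_0^T c(t)^{\top}x(t)\,dt
\quad\text{subject to}\quad
B(t)\,x(t)\;+\;\int_0^t K(t,s)\,x(s)\,ds \;\le\; b(t), \qquad x(t)\ge 0, \qquad \text{a.e. } t\in[0,T],
\]
with given data $B$, $K$, $b$, and $c$; the integral terms play the role of the kernel terms discussed above.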

This problem was introduced by Bellman [8] and has since been studied by several authors who have provided algorithms, structural results, or a duality theory for CLPs; see e.g. [25, 40, 26, 41, 19, 11, 37, 4, 29, 38, 17, 44] and references therein. Several applications, e.g. in manufacturing, transportation, robust optimization, queueing theory, and revenue management, can also be found in these references. Since CLPs are perceived as a hard problem class in general, most authors make additional assumptions on how the problem data varies with time, or, in the case of the so-called “separated CLPs”, how the kernel terms and the non-kernel terms interact [29, 17, 44, 11, 37, 4].

The closest work in the CLP literature to our work is the paper [7] by Bampou and Kuhn. The authors of this paper also assume that the data of their CLP varies polynomially with time and employ semidefinite programming to approximate the optimal solution by polynomial (and piecewise polynomial) functions of time. Our approach here generalizes their nice algorithms and convergence guarantees to the SDP setting. In [7], the authors also make use of the rich duality theory of CLPs to get a sequence of upper bounds that converges to the optimal value of (4) under certain conditions. The duality framework that we present in this paper is different in nature and is closer in spirit to the approach in [23, 6]. As it turns out, it suffices for us to assume boundedness of the primal feasible set to guarantee convergence of our dual bounds to the optimal value of the TV-SDP.

The only generalization of continuous linear programs that we are aware of appears in the work of Wang, Zhang, and Yao in [43], which makes a number of important contributions to separated continuous conic programs. The assumptions in [43] are however stronger than the ones we make here. In particular, there are separation assumptions on the kernel and non-kernel terms in [43] and the data to the problem is assumed to vary only linearly with time. Another related work is that of Lasserre [23], which studies a parametric polynomial optimization problem of the form

(5)

where is a probability distribution on some compact basic semialgebraic set , and are polynomial functions of and . An inequality involving is valid if it is valid for all in except on some set with . When the kernel terms in (2) are zero, problem (1) can in theory be put in the form of (5) by setting and replacing the semidefinite constraint with nonnegativity of all polynomials that form the principal minors of . Our duality framework in Section 4 is inspired by the approach in [23]. However, as we are dealing with a much more structured problem, we are also able to find the best polynomial solution of a given degree to (1) with an SDP of tractable size, as well as prove asymptotic optimality of polynomial solutions even in the presence of the kernel terms.

Finally, we remark that at a broader level, the idea of using semidefinite programming to find polynomial solutions (or “policies”) to dynamic or uncertain optimization problems has been applied before to questions in multi-stage robust and stochastic optimization; see e.g. [10] and [6].

1.2 Organization and contributions of the paper

This paper is organized as follows. In Section 2, we prove that under a boundedness assumption, the optimal value of the TV-SDP in (1) is attained (Theorem 3). This proof is obtained by combining two theorems that are used also in other sections of the paper. The first (Theorem 1) shows that a sequence of linear functionals that satisfies a certain boundedness property on nonnegative polynomials has a weakly convergent subsequence. The second (Theorem 2) shows that when a weakly convergent sequence of functions in satisfies linear inequalities of the type in (3), then so does its weak limit.

In Section 3, we prove that under a strict feasibility assumption, polynomial solutions are arbitrarily close to being optimal for the TV-SDP in (1) (Theorem 4). We also show that this assumption cannot be removed in general (Example 1). Furthermore, we show how sum of squares techniques combined with certain matrix Positivstellensätze enable the search for the best polynomial solution of a given degree to be cast as an SDP of polynomial size (Theorem 6).

In Section 4, we develop a hierarchy of dual problems (or relaxations) that give a sequence of improving upper bounds on the optimal value of the TV-SDP in (1). We show that under a boundedness assumption, these upper bounds converge to the optimal value of the TV-SDP (Theorem 7). We also show that our dual problems can be cast as SDPs (Theorem 8). For a given TV-SDP, the dimensions of the matrices that feature in both our primal and dual SDP hierarchies grow only linearly with the order of the hierarchy.

In Section 5, we present applications of time-varying semidefinite programs to a maximum-flow problem with time-varying edge capacities, a wireless coverage problem with time-varying coverage requirements, and to bi-objective semidefinite optimization where the goal is to approximate the Pareto curve in one shot. Finally, we end with some future research directions in Section 6.

1.3 Notation

We denote

  • the entry of a matrix by ,

  • the trace of a matrix by ,

  • the vector of all ones by ,

  • the identity matrix by ,

  • the diagonal matrix with the vector on its diagonal by ,

  • the standard inner product in by ; i.e., for two vectors , ,

  • the infinity-norm of a vector by ; i.e., for a vector , ,

  • the set of (constant) symmetric matrices by and its subset of positive semidefinite matrices by ,

  • the degree of a polynomial by (when is a vector of polynomials, denotes the maximum degree of its entries),

  • the set of matrices whose components are polynomials in the variable with real coefficients by . For , denotes the subset of consisting of matrices whose entries are polynomials of degree at most . When , we simply use the notation and , and when as well, we simplify the notation to and .

  • We denote the set of linear functionals on by .

  • For , we denote by the unique linear functional that satisfies

  • For a function , we denote by the element of defined by
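The displayed formulas defining the last two functionals were lost in extraction; a natural reconstruction of the second one, consistent with the inner product on $L_2^n[0,1]$ introduced earlier and with how these objects are used in Section 2 (our notation), is
\[
\ell_x(p) \;:=\; \int_0^1 x(t)^{\top} p(t)\,dt \qquad \text{for all vectors of polynomials } p,
\]
i.e., the functional associated with a function $x \in L_2^n[0,1]$ acts on a polynomial vector by integrating the pairing of the two over the time interval.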

2 The Optimal Value of a Bounded TV-SDP is Attained

In this section, we study the following question: If the optimal value of (1) is finite (i.e., the problem is feasible and bounded above), does there exist a function such that ? Many of the arguments given here will be used again in Section 4 on duality theory.

The question of attainment of the optimal value (i.e., existence of solutions) is a very basic one and has been studied in the continuous linear programming literature already; see e.g. [19]. In the TV-SDP case, note that even for standard SDPs that do not depend on time, the optimal value is not always attained unless the feasible set is bounded. We prove in this section that under the following boundedness assumption

(6)

the optimal value of the TV-SDP in (1) is always attained. This is not an immediate fact as the search space is infinite dimensional. The idea is to prove that a sequence of feasible solutions to a TV-SDP whose objective value approaches the optimal value must have a converging subsequence and that the limit of the subsequence must also be feasible. It turns out that the right notion of convergence in this context is weak convergence. We begin by stating the definition, and then prove that the weak limit of a sequence of feasible solutions is again feasible.

Definition 1 (Weak Convergence)

A sequence of linear functionals in converges weakly to a linear functional (we write ) if for all

Similarly, a sequence of functions in converges weakly to a function (we write ) if as .
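Spelled out with explicit symbols (ours), the two notions read: $\ell_k \rightharpoonup \ell$ means
\[
\ell_k(p) \;\to\; \ell(p) \quad \text{for every vector of polynomials } p,
\]
while $x_k \rightharpoonup x$ for functions in $L_2^n[0,1]$ means that the induced functionals converge in the same sense, i.e., $\int_0^1 x_k(t)^{\top}p(t)\,dt \to \int_0^1 x(t)^{\top}p(t)\,dt$ for every such $p$. This is why statements about weak limits of functionals (Theorem 1) transfer directly to functions.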

The next theorem shows a compactness result for the set .

Theorem 1

Let be a sequence of linear functionals in . If the following implication holds for every and every polynomial :

then there exists a function and a subsequence of that converges weakly to .

In the proof of this theorem, we will invoke the following lemma, which is obtained by a direct application of a result of Lasserre [24, Theorem 3.12a]. (To get the statement of the lemma, apply [24, Theorem 3.12a] with , where is the constant function equal to one, and observe that for any , the polynomials are nonnegative on the interval .)

Lemma 1 (See Theorem 3.12a in [24])

For a linear functional , if there exists a scalar such that the inequalities

hold for every polynomial that is nonnegative on , then there exists a function such that

Proof of Theorem 1. The ideas of the proof are inspired by those in [42, Chap. 7]. Let be a basis of where all entries of the polynomials are of the form for some nonnegative integer . Let denote the maximum degree of the entries of . It is clear by assumption that for every such that . Consider the sequence of real numbers . This sequence is bounded in absolute value by . As such, it has a convergent subsequence . Next consider . Again, this is a sequence of real numbers that is bounded in absolute value by and so it has a convergent subsequence . Iterating this procedure, we obtain, for each integer , a subsequence of linear functionals with the property that . Moreover, for all with , the sequence of numbers converges as . Now consider the diagonal sequence of linear functionals . For every , converges as as the sequence of linear functionals is a subsequence of . Since the functions span and the elements of the sequence are linear functionals, the sequence converges for all polynomial functions . Let be the linear functional defined by

(7)

We have just proven that the sequence converges weakly to . The claim of the theorem would be established if we show that there exists a function such that In order to get this statement from Lemma 1, for , let be defined as

Let be a polynomial that is nonnegative on . Take to be the vector-valued polynomial whose entries are all identically zero except for the one that is equal to . From (7) we see that

Since for larger than the degree of ,

we have that , and therefore

Similarly, it is straightforward to argue that . Hence, by Lemma 1, for each , there exists a function such that Therefore,

The function that we were after can hence be taken to be

The next theorem shows that when all functions in a sequence satisfy linear inequalities of the type in (3), their weak limit does the same.

Theorem 2

Let the operator be as in (2). If a sequence of functions in converges weakly to a function and satisfies for all , then

To prove this theorem, we need the following lemma, which also implies that the set is self-dual.

Lemma 2

For any function , if and only if

Proof. The only if part is straightforward. For the other direction, fix and assume that for all . For , let be the smallest eigenvalue of and

be an associated eigenvector of norm one. Denote by

the univariate function over that is equal to 1 when and zero otherwise. Let . We claim that . This would imply that

which proves that is nonnegative almost everywhere on ; i.e., the desired result.

To prove the claim, observe that since continuous functions are dense in the space of bounded and measurable functions on (see e.g. [1, Theorem 2.19]), for every positive integer , there exist continuous functions and such that

Notice that without loss of generality we can assume that for all and we have as

The Stone-Weierstrass theorem (see e.g. [39]) can now be utilized to conclude that for every positive integer , there exist polynomial functions , such that

We can thus assume without loss of generality again that the functions and are polynomial functions of the variable .
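For completeness, the form of the Stone–Weierstrass theorem used here is the classical polynomial approximation theorem on a compact interval (stated with our symbols): for every continuous $g:[0,1]\to\mathbb{R}$ and every $\varepsilon>0$ there exists a polynomial $q$ with
\[
\max_{t\in[0,1]} |g(t)-q(t)| \;\le\; \varepsilon .
\]
Applying this to each continuous approximant from the previous step, with a tolerance shrinking in the index, lets one replace the approximants by polynomials without affecting the limiting argument.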

Now let . Then (i) and (ii) as , where here denotes the norm associated to the scalar product . From the Cauchy-Schwarz inequality we have

As (i) implies that for all , and (ii) implies that the right hand side of the above inequality goes to zero as goes to infinity, we conclude that

Proof of Theorem 2. For an element , we denote by the element of defined by . By applying Fubini’s double integration theorem on the region , it is straightforward to see that

where is the adjoint of the affine operator (see equation (12) in Section 4 for its explicit expression). Now fix a function . Using the easy direction of Lemma 2 and the fact that for all , we have that for all . This implies that for all . By weak convergence, we conclude that , implying in turn that . Since was arbitrary in , using Lemma 2 again, we have .
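The Fubini step invoked here is the standard exchange of the order of integration over the triangle $\{(t,s) : 0 \le s \le t \le 1\}$ (symbols ours):
\[
\int_0^1 \!\! \int_0^t f(t,s)\,ds\,dt \;=\; \int_0^1 \!\! \int_s^1 f(t,s)\,dt\,ds ,
\]
which is what allows the kernel terms to be moved from one argument of the inner product to the other and produces the adjoint operator referred to above.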

We are now ready to show that a bounded TV-SDP attains its optimal value.

Theorem 3

If the TV-SDP in (1) is feasible and satisfies the boundedness assumption in (6), then there exists a feasible function that attains its optimal value.

Proof. Let opt denote the optimal value of (1), which is finite under the assumptions of the theorem. From (6), there exists a scalar such that any feasible solution to the TV-SDP satisfies for a.e. . Hence, for any positive integer , there exists a feasible solution , with , such that

(8)

Let us now consider the sequence of linear functionals , which satisfies the conditions of Theorem 1. Therefore, a subsequence of the functions converges weakly to a limit . It is clear by weak convergence that achieves the optimal value of (1), and Theorem 2 guarantees that is feasible for the TV-SDP. Letting gives the desired result.

3 The Primal Approach: Polynomial Solutions to a TV-SDP

We switch our focus in this section to algorithmic questions. We show in Section 3.2 that when the data to our TV-SDP belongs to the class of polynomial functions, then the best polynomial solution of a given degree to the TV-SDP can be found by solving a semidefinite program of tractable size. This motivates us to study whether one can always find feasible solutions to a TV-SDP that are arbitrarily close to being optimal just by searching over polynomial functions. While this is not always true (see Example 1 below), in Section 3.1 we show that it is true under a strict feasibility assumption (see Definition 2).
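To make the algorithmic idea concrete, here is a minimal computational sketch in the spirit of Section 3.2, written for the simplest scalar case (a time-varying linear program with one decision function and no kernel terms). It is not the paper's formulation: the data, the degrees, and the use of the cvxpy modeling package are our own assumptions for illustration. The sketch searches over polynomial solutions of a fixed degree and certifies feasibility on $[0,1]$ through the univariate sum-of-squares representation $p(t)=\sigma_0(t)+t(1-t)\sigma_1(t)$.

```python
# Minimal sketch (our own illustrative setup, not the paper's code): find the best
# polynomial x(t) of degree d for the scalar time-varying LP
#     maximize   int_0^1 c(t) x(t) dt
#     subject to a(t) x(t) <= b(t)   for all t in [0, 1],
# by requiring b(t) - a(t) x(t) = s0(t) + t (1 - t) s1(t) with s0, s1 sums of squares.
import cvxpy as cp

d = 3                         # degree of the candidate polynomial solution x(t)
a = [1.0, 0.5]                # a(t) = 1 + 0.5 t        (made-up data)
b = [2.0, 0.0, 1.0]           # b(t) = 2 + t^2
c = [1.0, 1.0]                # c(t) = 1 + t

x = cp.Variable(d + 1)        # coefficients of x(t), lowest degree first
xc = [x[i] for i in range(d + 1)]

def conv(p, q):
    """Coefficient list of the product p(t) * q(t), lowest degree first."""
    return [sum(p[i] * q[j - i] for i in range(len(p)) if 0 <= j - i < len(q))
            for j in range(len(p) + len(q) - 1)]

def sos_coeffs(Q, k):
    """Coefficient list of m(t)^T Q m(t) with m(t) = (1, t, ..., t^k)."""
    return [sum(Q[i, j - i] for i in range(k + 1) if 0 <= j - i <= k)
            for j in range(2 * k + 1)]

# p(t) = b(t) - a(t) x(t), padded to an even degree 2k
ax = conv(a, xc)
deg_p = max(len(b), len(ax)) - 1
k = (deg_p + 1) // 2
p = [(b[j] if j < len(b) else 0.0) - (ax[j] if j < len(ax) else 0.0)
     for j in range(2 * k + 1)]

# SOS multipliers s0 (degree 2k) and s1 (degree 2k - 2); this sketch assumes k >= 1
Q0 = cp.Variable((k + 1, k + 1), PSD=True)
Q1 = cp.Variable((k, k), PSD=True)
rhs = [si + ti for si, ti in zip(sos_coeffs(Q0, k),
                                 conv([0.0, 1.0, -1.0], sos_coeffs(Q1, k - 1)))]

# objective: int_0^1 c(t) x(t) dt = sum_j (coefficient of t^j in c*x) / (j + 1)
cx = conv(c, xc)
objective = cp.Maximize(sum(cx[j] / (j + 1) for j in range(len(cx))))
constraints = [p[j] == rhs[j] for j in range(2 * k + 1)]

prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.SCS)     # any SDP-capable solver works
print("best degree-%d polynomial value (a lower bound):" % d, prob.value)
print("coefficients of x(t):", x.value)
```

For the matrix-valued constraints of an actual TV-SDP, the scalar multipliers $\sigma_0,\sigma_1$ are replaced by sum-of-squares matrices; this is the role played by the Positivstellensatz on univariate polynomial matrices invoked for Theorem 6.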

Example 1

Consider the TV-SDP in (1) with ,

The resulting constraints read

The unique feasible solution to this TV-SDP, up to a set of measure zero, is

It is clear that is not continuous, let alone polynomial.

For the remainder of this paper, for a set and a nonnegative integer , we define to be the set of functions whose entries are polynomials of degree , i.e.

3.1 Polynomials are optimal under a strict feasibility assumption

We show in this section that under the following strict feasibility assumption, the optimal value of the TV-SDP in (1) remains the same when the function class is replaced with .

Definition 2

We say that the TV-SDP in (1) is strictly feasible if there exists a function and a positive scalar such that

Theorem 4

Consider the TV-SDP in (1) with its optimal value denoted by opt. If the TV-SDP is strictly feasible, then there exists a sequence of feasible polynomial solutions such that

As we will shortly see in the proof, the strict feasibility assumption enables us to approximate any feasible solution of (1) by a continuous, and later polynomial, solution. We use mollifying operators to obtain the continuous approximation.

Definition 3 (See [1])

The mollifying operator , indexed by a nonnegative integer , is the linear operator defined by

where when and otherwise, and is so that .
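Since the displayed formulas were lost, we record the standard mollifier construction from [1] that this definition describes, with our own symbols:
\[
(T_k x)(t) \;=\; \int x(s)\,\varphi_k(t-s)\,ds,
\qquad
\varphi_k(u) \;=\; k\,\varphi(k u),
\qquad
\varphi(u) \;=\;
\begin{cases}
c\, e^{-1/(1-u^2)} & \text{if } |u|<1,\\
0 & \text{otherwise},
\end{cases}
\]
where $c>0$ is chosen so that $\int \varphi(u)\,du = 1$ (and $x$ is extended by zero outside $[0,1]$, or the integral is restricted to $[0,1]$, so that the convolution is well defined). Note that $\varphi$ is even and nonnegative, which is what the proof of Lemma 3 relies on.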

Remark 1

To lighten our notation, we write instead of . We also remark that one can extend the definition of mollifying operators to functions that are not scalar valued by making them act element-wise. For example, the extension to spaces and would be defined as follows:

Any property of mollifying operators that we prove on scalar-valued functions below extends in a straightforward manner to functions that are vector or matrix valued.

Proposition 1 (See Theorem 2.29 in [1])

For all and all , the function is continuous. Moreover,

Furthermore, if is a continuous function of , then

Lemma 3

For any , the mollifying operator satisfies the following properties:

  • (a) For any , if , then .

  • (b) For any and , as .

  • (c) For any polynomial function and , let . Then

Proof. The proof of (a) simply follows from the fact that the function is nonnegative on .

To prove (b), let and . Notice that

Hence,

By the change of variable and in view of the evenness of the function we get

Therefore,

and the claim follows.

Let us now prove (c). Observe that, on the one hand, for every ,

On the other hand, from Proposition 1 (and continuity of ), we know that

and