## I Introduction

In this work, we study the existence and uniqueness of solutions to the covariance steering problem for discrete-time Gaussian linear systems with a squared Wasserstein distance terminal cost. This stochastic optimal control problem seeks a feedback control policy that steers the probability distribution of the state of the uncertain system close to a goal multivariate normal distribution over a finite time horizon, where the closeness of the two distributions is measured in terms of the squared Wasserstein distance between them. In our previous work [1], we showed that the latter problem can be reduced to a difference of convex functions program (DCP), provided that the control policy conforms to the so-called state feedback control parametrization, according to which the control input can be expressed as an affine function of the current state and all past states visited by the system. Whereas the focus in [1] was on the control design problem, in this work we focus on the analysis of the problem and, in particular, on questions about the existence and uniqueness of solutions and the convexity (or lack thereof) of the performance index.

Literature review: Early works on covariance control problems can be attributed to Skelton and his co-authors, who mainly examined infinite-horizon problems in a series of papers (see, for instance, [2, 3, 4]). Recently, finite-horizon covariance control problems for Gaussian linear systems have received significant attention; the reader may refer to [5, 6, 7] for the continuous-time case and [8, 9, 10, 11, 12, 13, 14] for the discrete-time case. The covariance steering problem for continuous-time Gaussian linear systems with a Wasserstein distance terminal cost was first studied in [15], whereas the same problem for the discrete-time case was studied in [1]. Both of these references present numerical algorithms (a shooting method in [15] and the convex-concave procedure in [1]) for control design, but do not address theoretical questions regarding the existence and uniqueness of solutions, or investigate convexity properties of the performance index.

Main contributions: Next we summarize the main contributions of this paper. First, we establish the existence of at least one global minimizer to the optimization problem. Subsequently, we derive first and second order conditions of optimality, and provide analytic expressions for the gradient and the Hessian of the performance index by utilizing specialized tools from matrix calculus (these analytic expressions may also facilitate the implementation of numerical optimization algorithms, and thus improve in practice the speed of convergence). Finally, we present a sufficient condition for the performance index to be a strictly convex function under which the optimization problem admits a unique solution. In particular, we show that when the terminal state covariance is bounded from above, with respect to the Löwner partial order over the cone of positive semidefinite matrices, by the covariance matrix of the goal normal distribution, then the Hessian of the performance index becomes a strictly positive definite matrix, which in turn implies that the performance index is a strictly convex function.

Outline of the paper: In Section II, we review a few important results from matrix calculus that we use throughout the paper. In Section III, we formulate the covariance steering problem with a Wasserstein distance terminal cost, and briefly outline its reduction to a DCP. Sections IV and V present the first and second order optimality conditions for the latter optimization problem, along with a sufficient condition for the convexity of the performance index. Finally, Section VI concludes the paper with a summary of remarks and directions for future research.

## II Preliminaries

Here we collect some notation and background material that will be used throughout the paper.

#### Set and inequality notations

We denote the set of nonnegative integers as $\mathbb{N}_{0}$, and for any positive integer $N$, let $[N] := \{1, 2, \dots, N\}$. We use the inequalities $\succeq$ and $\preceq$ in the sense of the Löwner partial order over the cone of symmetric positive semidefinite matrices.

#### Kronecker product, Kronecker sum, and the vec operator

The basic properties of the Kronecker product will be useful in the sequel, including the mixed-product property

$(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$,  (1)

and that matrix transpose and inverse are distributive w.r.t. the Kronecker product, i.e., $(A \otimes B)^{\top} = A^{\top} \otimes B^{\top}$ and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$. The vectorization operator $\operatorname{vec}(\cdot)$, which stacks the columns of its matrix argument, and the Kronecker product are related through

$\operatorname{vec}(ABC) = (C^{\top} \otimes A)\operatorname{vec}(B)$.  (2)

Furthermore,

$\operatorname{tr}(A^{\top}B) = \operatorname{vec}(A)^{\top}\operatorname{vec}(B)$.  (3)

We will also need the Kronecker sum

$A \oplus B := A \otimes I + I \otimes B$,

where $I$ is an identity matrix of commensurate dimension. For matrices $A$, $B$, $X$ of appropriate size and $X$ non-singular, we have

$(X \otimes X)^{-1}(A \oplus B)(X \otimes X) = (X^{-1}AX) \oplus (X^{-1}BX)$,  (4)

which is easy to verify using the definition of the Kronecker sum and (1), and will be useful later.
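These facts are easy to sanity-check numerically. The sketch below (NumPy, random matrices) verifies the mixed-product property, the vec-Kronecker relation, the trace-vec identity, the Kronecker sum definition, and the fact that a similarity transform distributes across a Kronecker sum; all are standard matrix-calculus identities rather than anything specific to this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))

kron = np.kron
vec = lambda M: M.reshape(-1, order="F")  # column-stacking vectorization

# mixed-product property: (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
assert np.allclose(kron(A, B) @ kron(C, D), kron(A @ C, B @ D))

# vec(ABC) = (C^T ⊗ A) vec(B)
assert np.allclose(vec(A @ B @ C), kron(C.T, A) @ vec(B))

# tr(A^T B) = vec(A)^T vec(B)
assert np.allclose(np.trace(A.T @ B), vec(A) @ vec(B))

# Kronecker sum: A ⊕ B = A ⊗ I + I ⊗ B
I = np.eye(3)
ksum = kron(A, I) + kron(I, B)

# a similarity transform distributes across the Kronecker sum
# (a consequence of the mixed-product property)
X = rng.standard_normal((3, 3)) + 3 * I  # well-conditioned, invertible
XX = kron(X, X)
lhs = np.linalg.inv(XX) @ ksum @ XX
Xi = np.linalg.inv(X)
rhs = kron(Xi @ A @ X, I) + kron(I, Xi @ B @ X)
assert np.allclose(lhs, rhs)
print("Kronecker identities verified")
```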

#### Commutation matrix

The commutation matrix $K$ is the unique symmetric permutation matrix such that

$K \operatorname{vec}(A) = \operatorname{vec}(A^{\top})$ for all square matrices $A$,

see e.g., [16]. Being orthogonal, $K$ satisfies $KK^{\top} = K^{\top}K = I$. Therefore, $K$ is idempotent of order two, i.e., $K^{2} = I$. Two useful properties of $K$ are

$K(A \otimes B)K = B \otimes A$, and $K(A \otimes B) = (B \otimes A)K$.

Notice that $K$, being symmetric orthogonal, has eigenvalues $\pm 1$. Consequently, the matrix $\tfrac{1}{2}(I + K)$, which is also symmetric idempotent, has eigenvalues $0$ and $1$. Another observation that will be useful is that $K$ commutes with "self Kronecker product or sum", i.e., for any square matrix $A$, we have

$K(A \otimes A) = (A \otimes A)K$,  (5a)
$K(A \oplus A) = (A \oplus A)K$,  (5b)

which follows from the properties of $K$ mentioned before. We also have

$\tfrac{1}{2}(I + K)\operatorname{vec}(A) = \operatorname{vec}\!\big(\tfrac{1}{2}(A + A^{\top})\big)$.  (6)

To see (6), notice that $\tfrac{1}{2}(I + K)\operatorname{vec}(A)$ equals $\tfrac{1}{2}\big(\operatorname{vec}(A) + \operatorname{vec}(A^{\top})\big) = \operatorname{vec}\!\big(\tfrac{1}{2}(A + A^{\top})\big)$.
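These properties of the commutation matrix are all checkable by direct construction. A minimal sketch that builds $K$ for $n \times n$ matrices and verifies the defining relation, involutivity, commutation with a self Kronecker product, the spectrum of $\tfrac{1}{2}(I+K)$, and the symmetrizer property:

```python
import numpy as np

def commutation_matrix(n):
    """Permutation matrix K with K @ vec(A) = vec(A.T) for n x n matrices A."""
    K = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            # entry A[i, j] sits at position j*n + i in vec(A),
            # and at position i*n + j in vec(A.T)
            K[i * n + j, j * n + i] = 1.0
    return K

n = 3
K = commutation_matrix(n)
vec = lambda M: M.reshape(-1, order="F")

rng = np.random.default_rng(1)
A = rng.standard_normal((n, n))

assert np.allclose(K @ vec(A), vec(A.T))                 # defining property
assert np.allclose(K, K.T)                               # symmetric
assert np.allclose(K @ K, np.eye(n * n))                 # K^2 = I
assert np.allclose(K @ np.kron(A, A), np.kron(A, A) @ K) # commutes with A ⊗ A

# (I + K)/2 is symmetric idempotent with eigenvalues 0 and 1
P = (np.eye(n * n) + K) / 2
evals = np.linalg.eigvalsh(P)
assert np.all((np.abs(evals) < 1e-9) | (np.abs(evals - 1) < 1e-9))

# symmetrizer: (I + K)/2 maps vec(A) to vec of the symmetric part of A
assert np.allclose(P @ vec(A), vec((A + A.T) / 2))
print("commutation matrix properties verified")
```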

#### Matrix differential and Jacobian

The matrix differential $\mathrm{d}$ and the vectorization $\operatorname{vec}(\cdot)$ are linear operators that commute with each other. We will frequently use the Jacobian identification rule [17, Ch. 9, Sec. 5], which for a given matrix function $F(X)$, is

$\mathrm{d}\operatorname{vec}(F) = \big(\mathrm{D}_{X}F\big)\,\mathrm{d}\operatorname{vec}(X)$,  (7)

where $\mathrm{D}_{X}F := \partial \operatorname{vec}(F)/\partial \left(\operatorname{vec}(X)\right)^{\top}$ is the Jacobian of $F$ evaluated at $X$. In case $F$ is independent of $X$, the Jacobian $\mathrm{D}_{X}F$ is a zero matrix. Some Jacobians of our interest are collected in the Appendix.
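The identification rule can be checked against finite differences. A sketch for the illustrative choice $F(X) = X^2$ (not one of the paper's Jacobians), whose differential $\mathrm{d}(XX) = (\mathrm{d}X)X + X(\mathrm{d}X)$ identifies the Jacobian as $X^{\top} \otimes I + I \otimes X$:

```python
import numpy as np

n = 3
rng = np.random.default_rng(2)
X = rng.standard_normal((n, n))
vec = lambda M: M.reshape(-1, order="F")

# F(X) = X @ X: d vec(F) = (X^T ⊗ I + I ⊗ X) d vec(X),
# so the identification rule gives the Jacobian below
I = np.eye(n)
J = np.kron(X.T, I) + np.kron(I, X)

# finite-difference approximation of the Jacobian, column by column
eps = 1e-6
J_fd = np.zeros((n * n, n * n))
for k in range(n * n):
    dX = np.zeros(n * n)
    dX[k] = eps
    Xp = X + dX.reshape((n, n), order="F")
    J_fd[:, k] = (vec(Xp @ Xp) - vec(X @ X)) / eps

assert np.allclose(J, J_fd, atol=1e-4)
print("Jacobian identification rule verified for F(X) = X @ X")
```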

#### Matrix geometric mean

Given two symmetric positive definite matrices $A$ and $B$, their geometric mean (see e.g., [18]) is the symmetric positive definite matrix

$A \,\#\, B := A^{1/2}\big(A^{-1/2} B A^{-1/2}\big)^{1/2} A^{1/2}$.  (8)

It satisfies intuitive properties such as $A \,\#\, B = B \,\#\, A$, $A \,\#\, A = A$, and $(A \,\#\, B)^{-1} = A^{-1} \,\#\, B^{-1}$.
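A numerical sketch of the geometric mean, computed via symmetric eigendecompositions; besides the properties above, it checks the standard characterization of $A \,\#\, B$ as the unique symmetric positive definite solution of the Riccati equation $X A^{-1} X = B$ (see e.g., [18]):

```python
import numpy as np

def sym_sqrt(S):
    """Principal square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(w)) @ V.T

def geometric_mean(A, B):
    """A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    As = sym_sqrt(A)
    Asi = np.linalg.inv(As)
    return As @ sym_sqrt(Asi @ B @ Asi) @ As

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)          # symmetric positive definite
M = rng.standard_normal((3, 3))
B = M @ M.T + 3 * np.eye(3)

G = geometric_mean(A, B)
assert np.allclose(G, G.T)                    # symmetric
assert np.all(np.linalg.eigvalsh(G) > 0)      # positive definite
assert np.allclose(G, geometric_mean(B, A))   # A # B = B # A
assert np.allclose(geometric_mean(A, A), A)   # A # A = A
# A # B solves the Riccati equation X A^{-1} X = B
assert np.allclose(G @ np.linalg.inv(A) @ G, B)
print("geometric mean properties verified")
```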

#### Function composition and normal distribution

We use the symbol $\circ$ to denote function composition. The symbol $x \sim \mathcal{N}(\mu, \Sigma)$ denotes that the random vector $x$ has a normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$.

## III Problem Setup

We consider a discrete-time stochastic linear system

$x_{t+1} = A_t x_t + B_t u_t + w_t$,  (9)

where $x_t$, $u_t$, and $w_t$ denote the state, control input, and disturbance vectors at time $t$, respectively. It is assumed that the initial state is a normal vector and in particular, $x_0 \sim \mathcal{N}(\mu_0, \Sigma_0)$, where $\mu_0$ and $\Sigma_0 \succ 0$ are given, and in addition, the disturbance process is a sequence of independent and identically distributed random vectors for all $t$. We suppose that $x_0$ and $w_t$ are mutually independent for all $t$, from which it follows that $\mathbb{E}[x_0 w_t^{\top}] = 0$ for all $t$, where $\mathbb{E}[\cdot]$ denotes the expectation operator. We assume that the matrices $B_t$ are full rank for all $t$.
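Under an affine state feedback, the state of (9) stays Gaussian, and its mean and covariance propagate by closed recursions. A minimal sketch of this propagation; the matrices, feedback gain, and disturbance covariance below are illustrative placeholders rather than the paper's data:

```python
import numpy as np

# illustrative 2-state, 1-input, time-invariant instance of (9)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
W = 0.01 * np.eye(2)              # disturbance covariance (assumed)

mu = np.array([1.0, 0.0])         # initial mean
Sigma = 0.5 * np.eye(2)           # initial covariance

K_fb = np.array([[-1.0, -0.5]])   # an arbitrary stabilizing state feedback gain
v = np.zeros(1)                   # feedforward term

for t in range(20):
    # u_t = K_fb x_t + v gives the closed loop x_{t+1} = (A + B K_fb) x_t + B v + w_t
    Acl = A + B @ K_fb
    mu = Acl @ mu + B @ v              # mean recursion
    Sigma = Acl @ Sigma @ Acl.T + W    # covariance recursion

# the covariance stays symmetric positive definite
assert np.allclose(Sigma, Sigma.T)
assert np.all(np.linalg.eigvalsh(Sigma) > 0)
print("terminal mean:", mu)
```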

For , let

Then, we can write

(10) |

where the block (column) vector

(11) |

and for all with , the matrices , and (note that ).

The problem of interest is to perform minimum energy feedback synthesis for (9) over a time horizon of length $N$, such that the distribution of the terminal state is steered close to a desired normal distribution whose mean vector and covariance matrix are given. The mismatch between the desired distribution and the distribution of the actual terminal state is penalized by a terminal cost quantified using the squared 2-Wasserstein distance between those two distributions. We refer the reader to [1, Sec. II] for the details of the problem formulation.
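For two normal distributions $\mathcal{N}(\mu_1, \Sigma_1)$ and $\mathcal{N}(\mu_2, \Sigma_2)$, the squared 2-Wasserstein distance admits the well-known closed form $\|\mu_1 - \mu_2\|^2 + \operatorname{tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_2^{1/2} \Sigma_1 \Sigma_2^{1/2})^{1/2}\big)$. A sketch of its evaluation:

```python
import numpy as np

def sym_sqrt(S):
    """Principal square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

def w2_squared_gaussians(mu1, S1, mu2, S2):
    """Squared 2-Wasserstein distance between N(mu1, S1) and N(mu2, S2)."""
    S2h = sym_sqrt(S2)
    cross = sym_sqrt(S2h @ S1 @ S2h)
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(S1 + S2 - 2 * cross))

mu1, S1 = np.zeros(2), np.eye(2)
mu2, S2 = np.array([3.0, 4.0]), np.eye(2)

# identical covariances: the distance reduces to the squared mean mismatch
assert np.isclose(w2_squared_gaussians(mu1, S1, mu2, S2), 25.0)
# the distance between a distribution and itself is zero
assert np.isclose(w2_squared_gaussians(mu1, S1, mu1, S1), 0.0)
print("W2^2 checks passed")
```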

To recover the statistics of the terminal state from the concatenated state , the following relation will be useful:

(13) |

It was shown in [1] that the problem of discrete-time covariance steering with Wasserstein terminal cost subject to (9) (or, equivalently, (10)) can be reduced to a difference of convex functions program, provided the control policy is parameterized as

(14) |

where , and the parameters of the control policy are , for all . The concatenated control input can be written as

(15) |

where , and

(16) |

The controller synthesis thus amounts to computing the optimal feedforward control and feedback gain pair .

In [1], the authors proposed a bijective mapping between the original and a new feedback gain parameterization, and back, given by

(17a) | ||||

(17b) | ||||

We have | ||||

(17c) |

With the new feedback gain parameterization , it was deduced in [1] that the optimal pair minimizes the objective , given by

(18) |

where is given, and

(19) |

and

(20) |

where

(21) |

and the block diagonal matrix .

###### Proposition 1.

Consider as in (21). Then .

###### Proof.

From (21), it is clear that . Suppose if possible that is singular. Then there exists vector such that , which in turn, is possible iff and , since , .

Now let where the sub-vector for all . From , we get since the matrices are full rank per our assumption. In , substituting , yields . Thus, which contradicts our hypothesis. Therefore, the positive semidefinite matrix is nonsingular, i.e., . ∎

###### Remark 1.

An important consideration is that in order to ensure the causality of the control policy, the matrix should be constrained to be block lower triangular of the form

(22) |

where for all index pairs .

The block lower triangular condition on in Remark 1 can be equivalently expressed as

(23) |

We transcribe this constraint in terms of the decision variable as

(24) |

where the two block vectors are defined so that the blocks indexed by the corresponding pair in (23) are equal to identity matrices of suitable dimensions; all the other blocks are equal to the zero matrix.
For example,

where denotes an identity matrix of size .

It is clear that (19) is a convex quadratic function of its arguments with Lipschitz continuous gradient. The squared Wasserstein distance (III) is a difference of convex functions in its arguments, and it can be shown that its gradient is also Lipschitz continuous. Thus, the objective in (18) is a difference of convex functions in the decision variables, and as such, it is unclear when it might in fact be convex. In [1], we used the convex-concave procedure [19] to numerically compute the optimal solution. In our numerical experiments, we observed multiple local minima, which motivates investigating the conditions of optimality for (18). This is what we pursue in Sections IV and V. Before doing so, we show that the objective in (18) is not convex in general, but that there exists a global minimizer.
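The convex-concave procedure referenced above iterates by linearizing the concave part of a difference of convex functions around the current iterate and minimizing the resulting convex surrogate. A minimal one-dimensional sketch; the objective below is illustrative, not the paper's (18):

```python
import numpy as np

# DC decomposition f(x) = g(x) - h(x), with g(x) = x^4 and h(x) = 2x^2 both convex
f = lambda x: x**4 - 2 * x**2
h_grad = lambda x: 4 * x      # derivative of the subtracted convex part h

x = 0.5                       # initial guess
for _ in range(50):
    # linearize -h around x and minimize the convex surrogate
    # y -> y^4 - h'(x) * y; its minimizer is (h'(x) / 4)^(1/3)
    x_next = np.cbrt(h_grad(x) / 4.0)
    # CCP produces a monotonically non-increasing cost sequence
    assert f(x_next) <= f(x) + 1e-12
    x = x_next

# from x0 = 0.5 the iterates converge to the local minimizer x = 1
assert np.isclose(x, 1.0, atol=1e-6)
print("CCP converged to", x)
```

Starting instead from a negative initial guess drives the iterates to the other local minimizer at $x = -1$, which illustrates why the procedure only returns a local solution of a DC program.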

###### Proposition 2.

The objective in (18) admits at least one global minimizer.

###### Proof.

The objective in (18) is continuous and coercive (i.e., it tends to $+\infty$ as the norm of its arguments tends to infinity), so the existence of a global minimizer follows from the Weierstrass extreme value theorem.

That is continuous in is immediate. To establish coercivity, following [1, see equation (26)], we write

(25) |

where

(26a) | |||

(26b) | |||

(26c) | |||

(26d) |

Since in (26a) is strictly convex quadratic in , it is clear that as .

We note that equals due to invariance of the trace operator under cyclic permutation. Using (2) and (3), we then write

(27) |

Since (by Proposition 1), we have . Thus, is a strictly convex quadratic function and as .

Finally, since comes from the expression of the squared Wasserstein distance which is lower bounded by zero, the function for all . Thus, , i.e., the function in (25) is coercive.

Notice that Proposition 2 only guarantees the existence of a global minimizer; it does not guarantee uniqueness. The following example shows that, in general, the objective is nonconvex, and there might be multiple local minima, which makes it challenging to find the global minimizer.

###### Example 1.

(Nonconvexity of ) Consider the system matrices

with time horizon . The initial and desired mean vectors are , , respectively. The initial covariance is . With this data, for two different desired distributions and with

we numerically computed (using convex-concave procedure, see [1, Sec. IV]) the minimizers () and (). Since the desired mean vector is the same in both cases, .

Recall that a function is convex iff its restriction to every line in its domain is convex. Fig. 1 shows that the restriction of the objective to the line through the two computed minimizers has multiple local minima; thus, the objective is nonconvex.
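The line-restriction test used in the example can be carried out numerically: sample the objective along the line through two candidate minimizers and inspect the discrete second differences. A sketch with an illustrative nonconvex function standing in for the paper's objective:

```python
import numpy as np

# illustrative nonconvex objective standing in for (18)
def f(z):
    return float(np.sum(z**4) - 2 * np.sum(z**2))

z1 = np.array([-1.0, -1.0])    # two local minimizers of this toy objective
z2 = np.array([1.0, 1.0])

# restriction of f to the line through z1 and z2: g(t) = f((1 - t) z1 + t z2)
t = np.linspace(-0.5, 1.5, 401)
g = np.array([f((1 - s) * z1 + s * z2) for s in t])

# a convex restriction would have nonnegative discrete second differences
d2 = g[:-2] - 2 * g[1:-1] + g[2:]
assert np.any(d2 < 0)          # negative curvature somewhere => f is nonconvex

# the restriction has multiple interior local minima
interior_min = np.where((g[1:-1] < g[:-2]) & (g[1:-1] < g[2:]))[0]
assert len(interior_min) >= 2
print("restriction along the line is nonconvex with multiple local minima")
```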

## IV First Order Conditions for Optimality

Recall the function in (25) and (26). We define the index set

Now consider the Lagrangian

(28) |

where is the Lagrange multiplier matrix associated with each linear equality constraint in (24), and denotes the Frobenius inner product. Let us denote the optimal pair as . The first order necessary conditions for optimality are

We next compute the gradients of w.r.t. the vector variable and the matrix variable , respectively, and use them to determine the pair .

### IV-A The optimal feedforward control

### IV-B The optimal feedback gain

From (25), (26) and (28), we have

(31) |

Notice that

(32) |

which follows from the invariance of trace under cyclic permutation, and from the fact that the directional derivative (in the matricial direction )

Furthermore, let , , , and notice that

(33) |

From the chain rule of Jacobians, we have

(34) |

wherein using Lemma 1 and 2 from Appendix, we get

(35a) | ||||

(35b) |

Combining (34) and (35), we obtain

(36a) | |||

(36b) |
