# Algorithms for Globally-Optimal Secure Signaling over Gaussian MIMO Wiretap Channels Under Interference Constraints

A multi-user Gaussian MIMO wiretap channel is considered under interference power constraints (IPC), in addition to the total transmit power constraint (TPC). Algorithms for global maximization of its secrecy rate are proposed. Their convergence to the secrecy capacity is rigorously proved and a number of properties are established analytically. Unlike known algorithms, the proposed ones are not limited to the MISO case and are proved to converge to a global rather than local optimum in the general MIMO case, even when the channel is not degraded. In practice, the convergence is fast, as only a small to moderate number of Newton steps is required to achieve a high precision level. The interplay of TPC and IPC is shown to result in an unusual property: in some (singular) cases, an optimal point of the max-min problem does not provide an optimal transmit covariance matrix. To address this issue, an algorithm is developed to compute an optimal transmit covariance matrix in those singular cases. It is shown that this algorithm also solves the dual (nonconvex) problems of globally minimizing the total transmit power subject to the secrecy and interference constraints; it provides the minimum transmit power and respective signaling strategy needed to achieve the secrecy capacity, hence allowing power savings.

## I Introduction

The ever-growing number of wireless users and their traffic, open system architectures and aggressive frequency re-use, as well as operation in unlicensed bands envisioned in 5G systems [1], create significant potential for inter-user interference, which needs to be carefully controlled and mitigated. Multiple antennas offer a significant potential for doing so in the space domain, especially in the context of massive MIMO [2]. This approach to interference mitigation and control has been investigated earlier in the context of cognitive radio (CR) [3], where secondary users are allowed to use the same bandwidth as primary users (who are the license holders) but are required to cause no significant interference to them. On the other hand, open system architectures and co-existence of several users in the same bandwidth, in combination with the broadcast nature of wireless channels, make transmissions vulnerable to eavesdropping of confidential information (e.g. e-commerce and e-health, mobile banking, Internet transactions, etc.) so that some form of secrecy protection is needed. In this context, the physical-layer security approach has emerged as a valuable complement to the traditional cryptography-based approach for modern wireless networks [4]-[6]. In this approach, the secrecy of communications is ensured at the physical layer by exploiting the properties of wireless communication channels so that no transmitted information can be recovered by malicious eavesdroppers. The wiretap channel (WTC) is widely used as a model of secrecy communications and its secrecy capacity has become the key metric of performance [4]-[9].

### I-A Literature Review

Using this approach in combination with MIMO systems offers significant new opportunities for enhancing the secrecy of multi-user wireless systems via space-domain processing. The MIMO WTC model became a popular tool to study physical-layer security, where the transmitter (Tx) sends confidential information to the receiver (Rx) while an eavesdropper (Ev) observes the transmission. The main performance metric, which is an ultimate upper bound to reliable and secret communications, is the secrecy capacity, defined operationally as the maximum achievable rate on the Tx-Rx link subject to the reliability (low error probability) and secrecy (low information leakage on the Tx-Ev link) criteria [4]-[6]. The secrecy capacity of the Gaussian MISO (multiple-input single-output) WTC was established in [7] and further extended to the full MIMO case in [8][9], where the optimality of Gaussian signaling was also established.

Hence, finding the secrecy capacity amounts to finding an optimal input (transmit) covariance matrix. This problem is still open analytically in the general case since the underlying optimization problem is not convex and hence very hard to solve, either numerically or analytically, while some special cases (MISO, full-rank MIMO, rank-1 MIMO, weak eavesdropper, identical right singular vectors of Rx and Ev channels, etc.) have been solved [7]-[13]. The case of two Tx antennas was studied in detail in [14], the massive MIMO setting was considered in [15][16], and finite-alphabet signaling was also studied [17]; an overview of recent results can be found in [5][6][17].

The Gaussian MISO WTC with multiple eavesdroppers was considered in [18], where the original non-convex problem was transformed to a quasi-convex one that can be solved as a sequence of convex feasibility problems using the bisection method. This MISO case with multiple Evs was also studied in [19], including a deterministic channel uncertainty model, where the original non-convex problem was transformed to a convex semi-definite one using determinantal inequality and the fact that optimal covariance is of rank-1 so that the inequality becomes equality; this new problem can be efficiently solved using existing convex solvers. The MISO channel with multi-eavesdroppers and stochastic channel uncertainty was studied in [20]. Unfortunately, the multi-Ev studies above did not establish the optimality of Gaussian signaling (but rather assumed it) and they cannot be extended to the full MIMO case since there exists no equivalent scalar channel anymore and optimal covariance is not necessarily rank-1. New approaches are needed. In this respect, the multi-Ev case, where eavesdroppers are not cooperative, is equivalent to a compound WTC whose operational capacity was established in [21][22] by demonstrating that Gaussian signaling is optimal. However, an optimal Tx covariance matrix is not known in the general case either.

Since there is no closed-form solution in the general case, even for a single eavesdropper, a number of numerical algorithms have been developed to maximize secrecy rates [23]-[25]. As the original problem is non-convex, these algorithms use some form of convexification, where the non-convex part of the objective (the Ev part) is expanded in a Taylor series and only first two terms are kept (i.e. the non-convex part is linearized, either explicitly or implicitly) [23]-[25]. Then, the approximated but convex problem is solved, an expansion point is iteratively updated and the process is repeated. The fundamental difficulty with this approach is that, even if the algorithm can be proved to converge, a convergence point is just a Karush-Kuhn-Tucker (KKT) point, but, due to the non-convex nature of the original (not approximate) problem, the KKT conditions are not sufficient for global optimality. Hence, a convergence point of these algorithms can be a local rather than global maximum, an inflection point, a local or even global minimum [44]-[46]. All these algorithms lack provable convergence to a global optimum due to the non-convex nature of the original problem and no way is known to overcome this fundamental difficulty. Furthermore, the gap to a global optimum is not known either.

Using a different approach, an algorithm with provable convergence to a global optimum in the general MIMO case (with a single eavesdropper) was proposed in [26]. The key idea was to avoid any form of convexification/approximation or alternating optimization (for which proving convergence to a global optimum is out of reach), but rather to use the max-min formulation in [8][9], without any approximations. However, this algorithm cannot be used in interference-constrained environments (e.g. CR) due to three fundamental issues: (i) while the feasible set is isotropic under the Tx power constraint (TPC) alone (no limits on eigenvectors, only on the sum of eigenvalues of the Tx covariance matrix), it is not isotropic anymore when interference power constraints (IPC) are added and this has a dramatic impact on the KKT conditions and numerical algorithms used to solve them; (ii) any of the constraints, including TPC, can be inactive under IPCs while the TPC is always active without IPC; furthermore, it is not known in advance which constraint is active and which is not so that an algorithm is required to determine this automatically; finally, (iii) a global convergence proof must include the interference constraints and the fact that some of them may be inactive.

An interference-constrained Gaussian MISO WTC with a single Ev was studied in [27]. It was shown that Gaussian signaling is optimal and that the operational secrecy capacity can be expressed as a quasi-convex optimization problem, which can be subsequently reduced to a sequence of convex feasibility problems [27] that can be solved using existing convex solvers. Imperfect channel state information (CSI) was accounted for in [28]. Secrecy rate maximization of the interference-constrained MISO (single-antenna Rx) WTC under single or multiple non-cooperative Evs and various channel assumptions (fixed, quasi-static or ergodic fading with full or partial channel state information) was studied in [29]-[32]; artificial noise and various beamforming solutions were proposed to maximize the secrecy rate.

However, it is not known whether these solutions are optimal, i.e. achieve the secrecy capacity, and what is the actual gap to the capacity. In addition, all these studies are limited to the MISO case, i.e. single-antenna receivers, and cannot be extended to the full MIMO case due to the fundamental limitations of the approach they use, i.e. transforming a MISO channel into an equivalent scalar channel and reducing (or relaxing) the original non-convex problem to a convex or quasi-convex one. In the full MIMO case, there is no equivalent scalar channel, beamforming is not an optimal strategy in general, the original problem is not convex and it is not known how to transform it into an equivalent convex or quasi-convex problem.

### I-B Contributions

Thus, a new approach is needed to deal with the full MIMO WTC under interference constraints. Unlike the previous studies in [27]-[32], in this paper we target capacity-achieving signaling over the full MIMO WTC under interference constraints and to this end develop an algorithm with provable convergence to a global optimum.

The proposed algorithm is based on the max-min secrecy capacity characterization originally developed in [8][9] under the TPC alone and later extended to the joint constraints (TPC+IPC) in [33], where the optimality of Gaussian signaling was established in the general MIMO case under the joint constraints and the max and max-min secrecy capacity characterizations of [8][9] were shown to hold as well (even though the feasible set under the joint constraints is not isotropic).

However, no analytical solutions to either the max or max-min problems above are known in the general case (some special cases have been solved in [33][34], but the general case remains an open problem). No algorithmic solution with provable convergence to a global optimum under the joint constraints is known either. Therefore, a numerical algorithm is needed to solve the problems and thus to find an optimal Tx covariance matrix and the secrecy capacity. Such an algorithm is proposed in the present paper. The distinct features of this new algorithm are that (i) it finds a globally-optimal (i.e. capacity-achieving) transmit covariance matrix, (ii) its convergence to a global (rather than local) optimum is rigorously proved, and (iii) it is of polynomial complexity.

It should be emphasized that the standard max-only formulation of the secrecy rate maximization problem, which is dominant in the current literature, see e.g. [23]-[25], does not allow one to build an algorithm with guaranteed convergence to a global optimum in the general case due to the lack of problem’s convexity (which makes provable global convergence out of reach, see e.g. [35][44]-[46]). Algorithms based on the max-only formulation face a fundamental difficulty since they may get trapped in a local optimum and hence their performance may be rather poor. For example, we show in Fig. 5 that the Taylor expansion-based sub-optimal algorithm as in [25] does get trapped at local optima (or stationary points), far away from the global one, resulting in poor performance and hence should be used with caution (or avoided at all) when the original problem is not convex. In general, non-convex problems are NP-hard (of exponential complexity) and the best one can hope for is convergence to a stationary point, which can be a local (rather than global) maximum, an inflection point or even a local minimum [44]-[46]. A convergence point may also depend on initial (starting) point, so that some bad initial points may result in bad results (e.g. a local minimum rather than maximum). The only known exception to this is the MISO case, where problem re-formulation is possible to a quasi-convex or some other tractable form, but this re-formulation is not possible in the general MIMO case (since there exists no equivalent scalar channel). On the other hand, the max-min characterization of [8][9][33], while appearing to be more complicated due to two conflicting optimizations, is in fact more tractable due to its convex-concave nature.

In this paper, we use the max-min characterization to construct an algorithm with provable convergence to a global optimum in the general MIMO case. This algorithm includes three key components: (i) the residual-form Newton method, (ii) the barrier method and (iii) backtracking line search. The barrier method is needed to absorb inequality constraints into the objective function, while the residual-form Newton method, in combination with backtracking line search, generates a sequence of points which converge to a globally-optimal max-min point for which the objective value is the secrecy capacity. When combined properly, they are proved to converge to a globally-optimal solution with any desired accuracy. In practice, only a small to moderate number of Newton steps is needed to achieve a high precision level.

While the algorithm above computes the secrecy capacity via a saddle-point of the max-min problem, its optimal covariance matrix is not necessarily a maximizer of the secrecy rate and hence cannot be used for globally-optimal (i.e. capacity-achieving) signaling in the general case. This unusual effect is entirely due to the interplay between TPC and IPC and cannot be found under the TPC alone, as in [8][9][26]. To address this issue, we establish general properties of the secrecy capacity as a function of Tx power and, based on them, develop an iterative bisection algorithm in Section IV (Algorithm 2), which evaluates numerically an optimal covariance in the general case with any desired accuracy; its convergence is also proved. Numerical experiments show that the proposed algorithms converge fast in practice and achieve higher secrecy rates (significantly higher when the channel is not degraded and its negative eigenmode is dominant) than the known sub-optimal algorithms.

Motivated by energy efficiency issues, dual problems of minimizing globally the total transmit power subject to secrecy and interference power constraints are considered in Section V. Since these problems are not convex, standard tools of convex optimization do not apply and they are difficult to solve (where "solve" means finding a global rather than local optimum). Yet, Proposition 6 shows that Algorithm 2 solves these problems as well. This provides the globally-minimum Tx power and respective signaling strategy needed to achieve a target secrecy rate under interference constraints.

Collectively, the two proposed algorithms evaluate the secrecy capacity and globally-optimal signaling strategy to achieve it in the interference-constrained multi-user Gaussian MIMO wiretap channel in the general case. This is substantially different from the known algorithms in [23]-[25][27][28], which either operate over a MISO channel only or which converge to a stationary (KKT) point only, which can be a local rather than global maximum, an inflection point, a local or even global minimum, and for which a proof of convergence to a global optimum is out of reach.

The rest of the paper is organized as follows. Section II introduces the channel model and gives its operational secrecy capacity; the model is general enough to include per-antenna power constraints as well. The algorithm for global maximization of secrecy rates over interference-constrained multi-user Gaussian MIMO wiretap channel is developed in Section III and its convergence is rigorously proved. Based on this algorithm, Section IV presents a bisection-based algorithm to evaluate numerically an optimal Tx covariance with any desired accuracy in the general case and its convergence is proved. Dual problems of minimizing the total transmit power subject to secrecy and interference power constraints are considered in Section V and Algorithm 2 is shown to solve these problems as well. Finally, numerical experiments to illustrate algorithms’ performance and practical convergence are given in Section VI.

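Notations: bold lower-case letters ($\mathbf{a}$) and capitals ($\mathbf{A}$) denote vectors and matrices respectively; $\mathbf{A} \geq 0$ denotes a positive semi-definite matrix $\mathbf{A}$; $\mathbf{A}^T$ is transposition while $\mathbf{A}^+$ is Hermitian conjugation; $\mathrm{tr}(\mathbf{A})$ is the trace; $\mathrm{vec}(\mathbf{A})$ is the vector obtained by stacking all columns of matrix $\mathbf{A}$ on top of each other and $\mathrm{vech}(\mathbf{A})$ is the vector obtained by vectorizing only the lower triangular part of $\mathbf{A}$; $\mathrm{diag}(\mathbf{A})$ is a diagonal matrix with the same diagonal entries as in $\mathbf{A}$; $\mathbb{E}\{\cdot\}$ is a statistical expectation; $\otimes$ is the Kronecker product; $|\mathbf{a}|$ and $|\mathbf{A}|$ are the Euclidean norm of vector $\mathbf{a}$ and the determinant of matrix $\mathbf{A}$; $\mathbf{I}$ is the identity matrix of appropriate size.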

## II Channel Model and Secrecy Capacity

Let us consider the standard Gaussian MIMO WTC model as shown in Fig. 1, where the transmitter (Tx) sends confidential information to the receiver (Rx) while eavesdroppers (Ev), who may be just other users in a multi-user system, intercept the transmission; the Evs are assumed to be cooperative, which is the most conservative assumption in terms of secrecy (this cooperation is possible in e.g. cloud radio access networks (C-RAN), where users' baseband data is centrally stored and processed [41]-[43] and hence a malicious user (super-Ev) can exploit it for eavesdropping). The objective is to ensure reliable communications between the Tx and Rx (the reliability criterion) while keeping the Evs ignorant about transmitted information (the secrecy criterion). In an interference-constrained (IC) multi-user environment, such as cognitive radio, the interference generated by the Tx to primary receivers (PR), who represent licensed users of the system, must not exceed certain thresholds. The secrecy capacity is defined operationally as the largest transmission rate on the Tx-Rx link subject to the reliability and secrecy criteria [4]-[9], where the reliability criterion ensures arbitrarily low error probability at the Rx while recovering the transmitted message; the secrecy criterion ensures arbitrarily low information leakage to the Evs. The Tx, the Rx and each Ev are equipped with multiple antennas. In the discrete-time AWGN MIMO channel model, the signals received by the Rx and each Ev can be expressed as

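$$\mathbf{y}_1 = \mathbf{H}_1\mathbf{x} + \boldsymbol{\xi}_1, \quad \mathbf{y}_{2i} = \mathbf{H}_{2i}\mathbf{x} + \boldsymbol{\xi}_{2i} \tag{1}$$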

where $\mathbf{y}_1$ ($\mathbf{y}_{2i}$) are the respective received signals at the Rx ($i$-th Ev), $\mathbf{x}$ is the transmitted signal, $\boldsymbol{\xi}_1$ ($\boldsymbol{\xi}_{2i}$) represent zero-mean unit-variance i.i.d. noise at the Rx ($i$-th Ev) end; $\mathbf{H}_1$ ($\mathbf{H}_{2i}$) are the channel matrices collecting channel gains from the Tx to the Rx ($i$-th Ev). In addition to this and following the interference-constrained model, there are $K$ PRs, each equipped with multiple antennas. The received signal at the $j$-th PR is similarly expressed as

$$\mathbf{y}_{3j} = \mathbf{H}_{3j}\mathbf{x} + \boldsymbol{\xi}_{3j} \tag{2}$$

where $\mathbf{H}_{3j}$ and $\boldsymbol{\xi}_{3j}$ are the channel matrix and zero-mean unit-variance i.i.d. noise. For future use, let $\mathbf{W}_i = \mathbf{H}_i^+\mathbf{H}_i$, $i = 1, 2$, and let $\mathbf{W}_{3j} = \mathbf{H}_{3j}^+\mathbf{H}_{3j}$. We assume that the full channel state information (CSI) is available to the Tx, Rx and each Ev (which is motivated by modern adaptive system design, where the channel is estimated at the Rx and sent back to the Tx via a feedback link; when Evs are just other users in the system, they also share their CSI with the base station).

Overall, the transmission is subject to the TPC and multiple IPCs, so that any Tx covariance matrix $\mathbf{R}$ must be in the following feasible set $\mathcal{S}_R$:

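$$\mathcal{S}_R = \{\mathbf{R} \geq 0 : \mathrm{tr}(\mathbf{R}) \leq P_T,\ \mathrm{tr}(\mathbf{W}_{3j}\mathbf{R}) \leq P_{Ij}\ \forall j\} \tag{3}$$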

where $P_T$ and $P_{Ij}$ are the maximum allowed transmit and interference powers at the Tx and each PR respectively, termed here the TPC and IPC powers. The IPC

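$$\mathrm{tr}(\mathbf{W}_{3j}\mathbf{R}) = \mathrm{tr}(\mathbf{H}_{3j}\mathbf{R}\mathbf{H}_{3j}^+) \leq P_{Ij}, \quad j = 1, 2, \ldots, K, \tag{4}$$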

ensures that the total interference power at the $j$-th PR does not exceed the IPC power $P_{Ij}$ so that this PR's performance is not distorted. This type of interference constraint has been widely adopted in the literature for regular systems (no secrecy) [3][49]-[51] as well as for secrecy systems [18][23][27]-[32]. In this multi-user environment, the secrecy capacity of the interference-constrained WTC is defined operationally as the largest achievable rate on the Tx-Rx link subject to the secrecy, reliability, transmit and interference power constraints simultaneously. Note that per-antenna power constraints (as in e.g. [47][48]), in addition to or instead of the TPC, can also be accommodated by setting some $\mathbf{W}_{3j}$ to be diagonal matrices with 0-1 entries.
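The feasible set in (3) can be checked numerically with a few lines of NumPy. The following is a minimal sketch under stated assumptions: real-valued matrices, randomly drawn channels, and hypothetical names (`in_feasible_set`, `H3`, `E1`) that do not appear in the paper. It also illustrates how a per-antenna constraint fits the same form through a diagonal 0-1 matrix.

```python
import numpy as np

def in_feasible_set(R, P_T, W3_list, P_I_list, tol=1e-9):
    """Check R >= 0, tr(R) <= P_T and tr(W_3j R) <= P_Ij for all j, as in (3)."""
    R = np.asarray(R)
    # Positive semi-definiteness via the eigenvalues of the Hermitian part.
    eigvals = np.linalg.eigvalsh((R + R.conj().T) / 2)
    if eigvals.min() < -tol:
        return False
    if np.trace(R).real > P_T + tol:
        return False
    return all(np.trace(W @ R).real <= P + tol for W, P in zip(W3_list, P_I_list))

rng = np.random.default_rng(0)
m = 3                                   # hypothetical number of Tx antennas
H3 = rng.standard_normal((2, m))        # hypothetical Tx-to-PR channel
W3 = H3.T @ H3                          # Gram matrix, so tr(W3 R) = tr(H3 R H3^T)
E1 = np.diag([1.0, 0.0, 0.0])           # 0-1 diagonal matrix: power of antenna 1

P_T = 1.0
R_iso = (P_T / m) * np.eye(m)           # isotropic signaling at full power
P_I = np.trace(W3 @ R_iso)              # interference power caused by R_iso

# Same covariance, three different sets of constraint powers.
ok_loose = in_feasible_set(R_iso, P_T, [W3, E1], [1.1 * P_I, 0.5])
ok_tight = in_feasible_set(R_iso, P_T, [W3, E1], [0.9 * P_I, 0.5])
ok_antenna = in_feasible_set(R_iso, P_T, [W3, E1], [1.1 * P_I, 0.2])
```

In this sketch, the isotropic covariance is feasible only for the loose constraint powers: tightening either the IPC or the per-antenna limit pushes it outside the set even though the TPC is still met.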

### II-A Secrecy Capacity of Interference-Constrained MIMO WTC

Since the Evs are cooperative, their received signals can be aggregated into a single vector resulting in a single meta-Ev as follows:

$$\mathbf{y}_2 = \mathbf{H}_2\mathbf{x} + \boldsymbol{\xi}_2 \tag{5}$$

where $\mathbf{y}_2$ and $\mathbf{H}_2$ are the aggregated received signal and channel matrix, respectively, obtained by stacking $\mathbf{y}_{2i}$ and $\mathbf{H}_{2i}$ over all Evs, and $\boldsymbol{\xi}_2$ is the aggregated noise. While the MISO case was considered in [27], its approach cannot be extended to the full MIMO case since an optimal covariance is not necessarily of rank-1 and there is no equivalent scalar channel allowing quasi-convex reformulation of the original non-convex problem.
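As a small illustration of the aggregation in (5), the sketch below stacks per-Ev channels into a single meta-Ev channel. It assumes real-valued, randomly drawn channels and row-wise stacking of the per-Ev blocks; all names and dimensions are hypothetical. Under this stacking, the Gram matrix of the meta-Ev channel equals the sum of the individual Gram matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4                                            # hypothetical number of Tx antennas
H2_list = [rng.standard_normal((2, m)),          # hypothetical channel to Ev 1
           rng.standard_normal((3, m))]          # hypothetical channel to Ev 2

H2 = np.vstack(H2_list)                          # meta-Ev channel: blocks stacked row-wise
W2 = H2.T @ H2                                   # Gram matrix of the meta-Ev channel
W2_sum = sum(H.T @ H for H in H2_list)           # sum of the individual Gram matrices

x = rng.standard_normal(m)                       # one transmitted vector
y2 = H2 @ x                                      # noiseless part of the aggregated signal
y2_blocks = np.concatenate([H @ x for H in H2_list])
```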

A different approach was adopted in [33]: it is based on the max-min characterization of the secrecy capacity originally developed in [8][9] under the TPC alone and its further extension to the case of the joint constraints (TPC+IPC). This established the operational secrecy capacity of the interference-constrained MIMO WTC above as follows (Gaussian signaling is still optimal in this setting).

###### Theorem 1.

The operational secrecy capacity of interference-constrained Gaussian MIMO WTC in (1), (5) and (2) under the TPC and the IPCs in (3) can be expressed as

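$$C = \max_{\mathbf{R} \in \mathcal{S}_R} C(\mathbf{R}) = \max_{\mathbf{R} \in \mathcal{S}_R} \min_{\mathbf{K} \in \mathcal{S}_K} f(\mathbf{R}, \mathbf{K}) \quad \text{(P1)} \tag{6}$$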

where

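$$C(\mathbf{R}) = \ln|\mathbf{I} + \mathbf{W}_1\mathbf{R}| - \ln|\mathbf{I} + \mathbf{W}_2\mathbf{R}|, \tag{7}$$

$$f(\mathbf{R}, \mathbf{K}) = \ln|\mathbf{I} + \mathbf{K}^{-1}\mathbf{H}\mathbf{R}\mathbf{H}^+| - \ln|\mathbf{I} + \mathbf{W}_2\mathbf{R}|, \tag{8}$$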

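and $\mathbf{H} = [\mathbf{H}_1^+, \mathbf{H}_2^+]^+$, $\mathcal{S}_K$ is a set of noise covariance matrices of the form

$$\mathcal{S}_K = \left\{\mathbf{K} : \mathbf{K} = \begin{bmatrix} \mathbf{I} & \mathbf{N} \\ \mathbf{N}^+ & \mathbf{I} \end{bmatrix},\ \mathbf{K} \geq 0\right\}, \tag{9}$$

where $\mathbf{N}$ is the noise cross-covariance.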
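The two objectives in (7) and (8) are straightforward to evaluate numerically. The sketch below is illustrative only: it assumes real-valued channels, takes $\mathbf{W}_i$ to be the channel Gram matrices and $\mathbf{H}$ to be the Rx and Ev channels stacked row-wise, and uses hypothetical function names. For a fixed covariance, it checks that $f(\mathbf{R}, \mathbf{K})$ is never below $C(\mathbf{R})$ for several admissible $\mathbf{K}$, which is consistent with the inner minimization over $\mathbf{K}$ in (6).

```python
import numpy as np

def logdet(A):
    """Natural log of det(A) for a matrix with positive determinant."""
    sign, val = np.linalg.slogdet(A)
    assert sign > 0
    return val

def secrecy_rate(R, W1, W2):
    """C(R) = ln|I + W1 R| - ln|I + W2 R|, as in (7)."""
    I = np.eye(R.shape[0])
    return logdet(I + W1 @ R) - logdet(I + W2 @ R)

def f_maxmin(R, K, H, W2):
    """f(R, K) = ln|I + K^{-1} H R H^T| - ln|I + W2 R|, as in (8)."""
    I_n = np.eye(H.shape[0])
    I_m = np.eye(R.shape[0])
    return logdet(I_n + np.linalg.solve(K, H @ R @ H.T)) - logdet(I_m + W2 @ R)

def noise_cov(N):
    """K = [[I, N], [N^T, I]], as in (9)."""
    n1, n2 = N.shape
    return np.block([[np.eye(n1), N], [N.T, np.eye(n2)]])

rng = np.random.default_rng(2)
m, n1, n2 = 3, 2, 2                      # hypothetical antenna counts
H1 = rng.standard_normal((n1, m))        # hypothetical Tx-Rx channel
H2 = rng.standard_normal((n2, m))        # hypothetical Tx-Ev channel
H = np.vstack([H1, H2])
W1, W2 = H1.T @ H1, H2.T @ H2

A = rng.standard_normal((m, m))
R = A @ A.T
R /= np.trace(R)                         # positive definite, unit total power

C_R = secrecy_rate(R, W1, W2)
gaps = []
for _ in range(5):
    N = rng.standard_normal((n1, n2))
    N *= 0.5 / np.linalg.norm(N, 2)      # spectral norm 0.5 keeps K positive definite
    gaps.append(f_maxmin(R, noise_cov(N), H, W2) - C_R)
```

Every entry of `gaps` is non-negative here: giving the Rx access to the Ev observation cannot reduce its information about the transmitted signal, whatever the noise cross-covariance.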

We emphasize that Theorem 1 characterizes the operational secrecy capacity (the largest achievable secrecy rate) rather than an information capacity defined formally as the difference of two mutual information terms, as is sometimes done in the literature (without proving its operational significance). The case of multiple non-cooperating Evs corresponds to a compound WTC (see e.g. [21][22]) and is much more difficult to analyze; its secrecy capacity is not known under interference constraints in general (it is not even known whether Gaussian signaling is optimal). The capacity above is a lower bound to that of the non-cooperative case, since Ev cooperation, while having no effect on the Rx error probability, cannot decrease the information leakage and hence cannot increase the secrecy capacity. However, if there exists a dominant Ev, as in [33, Proposition 8], then Theorem 1 still holds, with $\mathbf{W}_2$ being the channel Gram matrix of the dominant Ev.

Theorem 1 provides two equivalent characterizations of the secrecy capacity: as a max problem or as a max-min problem. While the first characterization appears to be easier for numerical optimization and is indeed widely used in the existing literature [23]-[25], it makes it virtually impossible to prove convergence to a global optimum since the max problem is not convex ($C(\mathbf{R})$ is not concave unless the channel is degraded, see e.g. [8]), and hence its KKT conditions are not sufficient for global optimality. Provable convergence to a global optimum in this case is out of reach [35][44]-[46].

Here, we adopt a different approach based on the max-min representation of the secrecy capacity in (6), denoted below as (P1). While this representation involves two conflicting optimizations, it is actually easier for numerical optimization, since both optimizations are convex: $f(\mathbf{R}, \mathbf{K})$ is concave in $\mathbf{R}$ for any fixed $\mathbf{K}$ and convex in $\mathbf{K}$ for any fixed $\mathbf{R}$ (see [8][12][33] for further details) and, hence, the respective KKT conditions for both optimizations are jointly sufficient for global optimality. This opens up a path to develop an iterative algorithm, based on this representation, with provable convergence to a global (rather than local) optimum.

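We caution the reader that while the optimal values of the max and max-min problems in (6) are the same, the respective optimal covariances are not necessarily the same: in some cases, $\mathbf{R}' \neq \mathbf{R}^*$ and $\mathbf{R}'$ does not maximize $C(\mathbf{R})$ (see (34)-(38) for an example), where $\mathbf{R}^*$ and $(\mathbf{R}', \mathbf{K}')$ are optimal points of the max and max-min problems respectively,

$$\mathbf{R}^* = \arg\max_{\mathbf{R} \in \mathcal{S}_R} C(\mathbf{R}), \quad (\mathbf{R}', \mathbf{K}') = \arg\max_{\mathbf{R} \in \mathcal{S}_R} \min_{\mathbf{K} \in \mathcal{S}_K} f(\mathbf{R}, \mathbf{K}) \tag{10}$$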

In some (singular) cases, the difference can be significant (see Figs. 5 and 8). This phenomenon never appears without IPCs (under the TPC alone), as in [8][9][26], where both problems always share the same optimal covariance matrix, $\mathbf{R}^* = \mathbf{R}'$. To address this issue, Algorithm 2 is developed in Section IV, which computes iteratively an optimal covariance matrix in these singular cases. Its convergence is also proved. Finally, it should also be noted that an optimal covariance $\mathbf{R}^*$ is not necessarily unique (see e.g. Example 2 in [34]), which motivates the power minimization problem (P4) in (49).

## III Capacity-Achieving Signaling Under Interference Constraints

In this section, we propose an iterative algorithm to solve (P1) numerically and prove its convergence to a global optimum. Performing the max and min optimizations in the max-min part of (6) separately immediately faces a serious and fundamental difficulty in achieving or proving convergence of the algorithm due to its oscillatory behaviour, which is caused by the conflicting (max-min) optimization operations. To overcome this difficulty, we use the residual form of the Newton method, where both optimizations (max and min) are done simultaneously, so that the residual of the KKT conditions is reduced at each iteration and converges monotonically to zero as the algorithm progresses (see e.g. [35] for more details on this general approach). This opens up a way to provable convergence to a global optimum, which is out of reach for the max problem in (6) due to its non-convex nature.

We develop below an iterative algorithm, which is able to handle any number of interference constraints and which does not require advance knowledge of which constraint is active and which is not. This algorithm is based on the max-min representation of the secrecy capacity in (6) and includes the barrier method, the residual-form Newton method and the backtracking line search, see e.g. [35] for more details on these algorithms. Unlike generic convex optimization algorithms or solvers, our algorithm here is specifically tailored for secrecy rate maximization in multi-user Gaussian MIMO WTC under interference constraints. Its convergence to a global optimum is rigorously proved, even when the WTC is not degraded and hence the max problem in (6) is not convex. This is a distinct advantage not found in other known algorithms, e.g. in [23]-[25], where either no convergence at all is proved or where only convergence to a stationary point is proved, which is not necessarily a global maximum, as discussed above.

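The key idea of the barrier method is to substitute the original objective function $f(\mathbf{R}, \mathbf{K})$ by a modified one $f_t(\mathbf{R}, \mathbf{K})$, which includes additional barrier terms as follows:

$$f_t(\mathbf{R}, \mathbf{K}) = f(\mathbf{R}, \mathbf{K}) + I_1(\mathbf{R}) + I_2(\mathbf{R}) + \sum_j I_{3j}(\mathbf{R}) - I_4(\mathbf{K}) \tag{11}$$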

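where $t > 0$ is the barrier parameter and

$$I_1(\mathbf{R}) = t^{-1}\ln|\mathbf{R}|, \tag{12}$$

$$I_2(\mathbf{R}) = t^{-1}\ln(P_T - \mathrm{tr}(\mathbf{R})), \tag{13}$$

$$I_{3j}(\mathbf{R}) = t^{-1}\ln(P_{Ij} - \mathrm{tr}(\mathbf{W}_{3j}\mathbf{R})), \tag{14}$$

$$I_4(\mathbf{K}) = t^{-1}\ln|\mathbf{K}|, \tag{15}$$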

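so that all inequality constraints are absorbed in the respective barrier terms $I_1$, $I_2$, $I_{3j}$ and $I_4$. Note that the domain of $f_t$ is $\mathcal{S}'_R \times \mathcal{S}'_K$, where

$$\mathcal{S}'_R = \{\mathbf{R} \in \mathcal{S}_R : \mathbf{R} > 0,\ \mathrm{tr}(\mathbf{R}) < P_T,\ \mathrm{tr}(\mathbf{W}_{3j}\mathbf{R}) < P_{Ij}\ \forall j\}, \quad \mathcal{S}'_K = \{\mathbf{K} \in \mathcal{S}_K : \mathbf{K} > 0\}, \tag{16}$$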

i.e. $\mathbf{R}$ and $\mathbf{K}$ are strictly inside of the original feasible sets (but may approach the boundary arbitrarily closely; this is a key feature of the barrier method). Note also that $f_t$ is convex-concave in the right way, i.e. concave in $\mathbf{R}$ for any fixed $\mathbf{K}$ and convex in $\mathbf{K}$ for any fixed $\mathbf{R}$, so that the respective optimization problems are convex and their KKT conditions are jointly sufficient for global optimality.
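The barrier terms in (12)-(15) can be sketched as follows. This is an illustrative fragment rather than the paper's implementation: it assumes real symmetric matrices, and the function name `barrier_terms` and the numerical values are hypothetical. The function returns the total barrier contribution to $f_t$ in (11), or `None` when the point is not strictly inside the feasible sets, where the logarithms are undefined.

```python
import numpy as np

def barrier_terms(R, K, t, P_T, W3_list, P_I_list):
    """I1(R) + I2(R) + sum_j I3j(R) - I4(K) from (11)-(15); None outside the domain."""
    slack_T = P_T - np.trace(R)
    slacks_I = [P - np.trace(W @ R) for W, P in zip(W3_list, P_I_list)]
    interior = (np.linalg.eigvalsh(R).min() > 0
                and np.linalg.eigvalsh(K).min() > 0
                and slack_T > 0
                and all(s > 0 for s in slacks_I))
    if not interior:
        return None
    _, logdet_R = np.linalg.slogdet(R)
    _, logdet_K = np.linalg.slogdet(K)
    total = logdet_R + np.log(slack_T) + sum(np.log(s) for s in slacks_I) - logdet_K
    return total / t

R = 0.3 * np.eye(2)                      # strictly feasible covariance (hypothetical)
K = np.eye(2)                            # noise covariance with zero cross-covariance
W3 = np.diag([1.0, 0.5])                 # hypothetical PR Gram matrix
P_T, P_I = 1.0, 0.5

b1 = barrier_terms(R, K, 1.0, P_T, [W3], [P_I])
b100 = barrier_terms(R, K, 100.0, P_T, [W3], [P_I])
b_out = barrier_terms(0.5 * np.eye(2), K, 1.0, P_T, [W3], [P_I])   # tr(R) = P_T
```

Since every barrier term scales as $t^{-1}$, `b100` is exactly `b1 / 100` in this sketch, which illustrates why the modified objective approaches the original one as the barrier parameter grows.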

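In the proposed algorithm, we use the residual-form Newton method to compute an optimal point of (P2) below for a fixed $t$ in an iterative way and with high accuracy. To facilitate implementation, we use real rather than complex variables. To reduce the number of variables and improve the efficiency, we exploit the symmetry of $\mathbf{R}$ and $\mathbf{K}$ and use $\mathbf{x} = \mathrm{vech}(\mathbf{R})$ and $\mathbf{y} = \mathrm{vec}(\mathbf{N})$ as independent variables to represent $\mathbf{R}$ and $\mathbf{K}$, where operator $\mathrm{vec}(\mathbf{A})$ stacks all columns of $\mathbf{A}$ on top of each other and $\mathrm{vech}(\mathbf{A})$ does so for the lower-triangular part of $\mathbf{A}$. Since $\mathbf{N}$ is used as independent variables to represent $\mathbf{K}$, the equality constraint in (9) is satisfied automatically. The original max-min problem (P1) in (6) is transformed into the following unconstrained problem:

$$\text{(P2)} \quad \max_{\mathbf{x}} \min_{\mathbf{y}} f_t(\mathbf{R}, \mathbf{K}) \tag{17}$$

so that its KKT conditions are simply the stationarity conditions:

$$\mathbf{r}(\mathbf{z}) = \nabla_{\mathbf{z}} f_t = 0 \tag{18}$$

where

$$\mathbf{z} = \begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix}, \quad \mathbf{r}(\mathbf{z}) = \begin{bmatrix} \nabla_{\mathbf{x}} f_t \\ \nabla_{\mathbf{y}} f_t \end{bmatrix} \tag{19}$$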

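are the aggregate vector of the variables and the residuals respectively. In the residual-form Newton method, the optimality condition $\mathbf{r}(\mathbf{z}) = 0$ is iteratively solved using a first-order approximation of $\mathbf{r}(\mathbf{z})$ at each step (which corresponds to the second-order approximation of the objective):

$$\mathbf{r}(\mathbf{z}_k + \Delta\mathbf{z}) = \mathbf{r}(\mathbf{z}_k) + \mathbf{D}_r\Delta\mathbf{z} + o(\Delta\mathbf{z}) = 0, \tag{20}$$

where $\mathbf{z}_k$ and $\Delta\mathbf{z}$ are the current variables and their updates respectively at iteration $k$, and where $\mathbf{D}_r$ is the derivative of $\mathbf{r}(\mathbf{z})$, i.e. the Hessian of $f_t$:

$$\mathbf{D}_r = \begin{bmatrix} \nabla^2_{\mathbf{x}\mathbf{x}} f_t & \nabla^2_{\mathbf{x}\mathbf{y}} f_t \\ \nabla^2_{\mathbf{y}\mathbf{x}} f_t & \nabla^2_{\mathbf{y}\mathbf{y}} f_t \end{bmatrix}. \tag{21}$$

Closed-form expressions for gradients and Hessians are given in the Appendix. By ignoring $o(\Delta\mathbf{z})$, (20) can be reduced to a system of linear equations in $\Delta\mathbf{z}$:

$$\mathbf{r}(\mathbf{z}_k) + \mathbf{D}_r\Delta\mathbf{z} = 0 \tag{22}$$

which can be solved numerically using any of the existing (and efficient) techniques. When the Hessian $\mathbf{D}_r$ is non-singular, (22) has a unique solution $\Delta\mathbf{z}$. In our case, the non-singularity of $\mathbf{D}_r$ at each step of the Newton method is rigorously established below. After computing $\Delta\mathbf{z}$ from (22), $\mathbf{z}_k$ is updated as follows

$$\mathbf{z}_{k+1} = \mathbf{z}_k + s\Delta\mathbf{z} \tag{23}$$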

where $k$ denotes the Newton iteration number (step) and where $s$ is the step size, which can be found via backtracking line search (see e.g. [35] for a background on this method). The Newton method in combination with the backtracking line search is guaranteed to reduce the residual norm at each step, which follows from the respective norm-reduction property [35], so that for sufficiently small $s$, the residual norm shrinks at each iteration, approaching 0 as $k$ increases. After several iterations, the convergence becomes quadratic (see [35] for related definitions and analysis) and hence very fast, so that the optimal point of the problem (17) can be approached with any desired accuracy in a small to moderate number of steps. Following the barrier method, the problem in (17) is solved for sequentially increasing $t$, where the optimal point of the previous $t$ serves as an initial point for the new, increased $t$, thus minimizing the total number of Newton iterations required [35]. It can be shown that $f(R(t),K(t)) \to C$ as $t \to \infty$, so that any desired accuracy can be reached (see Proposition 2 below).
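The inner loop described above (Newton update (22)-(23) with backtracking on the residual norm) can be sketched on a toy saddle-point problem. This is only an illustration under stated assumptions: the toy objective $f(x,y) = \ln x - xy + y^2$ (concave in $x$, convex in $y$, domain $x > 0$) stands in for $f_t$, and the names `newton_residual`, `r_toy`, `Dr_toy`, `in_domain`, `beta` and `eps` are hypothetical; the paper's closed-form gradients and Hessians are in its Appendix and are not reproduced here.

```python
import numpy as np

def newton_residual(r, Dr, z0, in_domain, alpha=0.3, beta=0.5, eps=1e-10, max_iter=50):
    """Residual-form Newton method with backtracking line search.

    r: residual function, Dr: its Jacobian (the Hessian of the objective),
    in_domain: predicate rejecting points outside the barrier domain.
    alpha, beta, eps are assumed line-search and tolerance parameters.
    """
    z = np.asarray(z0, dtype=float)
    for _ in range(max_iter):
        rz = r(z)
        norm_rz = np.linalg.norm(rz)
        if norm_rz <= eps:
            break
        dz = np.linalg.solve(Dr(z), -rz)      # solve r(z_k) + D_r dz = 0, cf. (22)
        s = 1.0
        while True:
            z_new = z + s * dz                # candidate update, cf. (23)
            # accept when feasible and the residual norm drops enough, cf. (26)
            if in_domain(z_new) and np.linalg.norm(r(z_new)) <= (1 - alpha * s) * norm_rz:
                break
            s *= beta
            if s < 1e-16:
                raise RuntimeError("line search failed")
        z = z_new
    return z

# Toy residual: gradient of f(x, y) = ln(x) - x*y + y**2
def r_toy(z):
    x, y = z
    return np.array([1.0 / x - y, -x + 2.0 * y])

def Dr_toy(z):
    x, _ = z
    return np.array([[-1.0 / x**2, -1.0], [-1.0, 2.0]])

z_opt = newton_residual(r_toy, Dr_toy, [1.0, 0.0], in_domain=lambda z: z[0] > 0)
```

For this toy problem the stationarity conditions give $y = x/2$ and $1/x = y$, i.e. the saddle point $(\sqrt{2}, 1/\sqrt{2})$, which is what the iteration approaches. In the barrier method, this routine would be called once per value of $t$, warm-started from the previous solution.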

The proposed algorithm is shown below, where $\alpha$ is the percentage of the linear decrease in the residual norm one is willing to accept in the backtracking line search; two further parameters control the reduction in step size and the increase in barrier parameter at each iteration of the respective loop of the algorithm; a tolerance sets the target residual accuracy; $t_0$ and $t_{\max}$ are the initial and maximum values of the barrier parameter; $t$ varies from $t_0$ to $t_{\max}$, where the latter controls the accuracy of the barrier method so that the inaccuracy in the secrecy capacity due to the barrier method does not exceed $\max(m_R, n_K)/t_{\max}$ (see (28)). $z_0$ is an initial point defined as follows

$$x_0 = \mathrm{vech}(P_T I/a), \quad y_0 = 0, \quad a = 2\max\{m, \{\mathrm{tr}(W_{3j})P_T/P_{Ij}\}\}, \qquad (24)$$

so that $R_0 = P_T I/a$. Note that $x_0$ represents isotropic signaling satisfying all power constraints and $y_0$ represents uncorrelated noise. Numerical experiments show that this initial point results in fast convergence in all studied cases. While the barrier method generates a sequence of $R$ which are strictly inside the feasible set (e.g. non-singular), they may approach the boundary arbitrarily closely, thus representing a rank-deficient solution. In this case, non-zero but very small eigenvalues of $R$ can be rounded off to zero, facilitating low-complexity (low-rank) implementation, which includes beamforming as a special case.
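The initial point in (24) can be checked numerically with a short sketch. The helper names `vech` and `initial_point` are hypothetical, real symmetric matrices are assumed, and the length `n_y` of $y_0$ is passed in as an assumption since it depends on how $K$ is parameterized.

```python
import numpy as np

def vech(A):
    """Stack the lower-triangular part of A (diagonal included), column by column."""
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])

def initial_point(m, P_T, W3, P_I, n_y):
    """Isotropic starting point of (24); n_y (length of y0) is an assumed input."""
    a = 2.0 * max([m] + [np.trace(W) * P_T / p for W, p in zip(W3, P_I)])
    R0 = P_T * np.eye(m) / a
    return vech(R0), np.zeros(n_y), R0

# Usage: m = 2 Tx antennas, one IPC with W3 = diag{1, 0}
W3 = [np.diag([1.0, 0.0])]
P_I = [1.0]
x0, y0, R0 = initial_point(m=2, P_T=4.0, W3=W3, P_I=P_I, n_y=4)
```

Since $a \ge 2m$ and $a \ge 2\,\mathrm{tr}(W_{3j})P_T/P_{Ij}$, one gets $\mathrm{tr}(R_0) \le P_T/2$ and $\mathrm{tr}(W_{3j}R_0) \le P_{Ij}/2$, so this point is strictly feasible, as the barrier method requires.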

### III-A Analysis of Algorithm 1

In this section, we prove the convergence of Algorithm 1 to a globally-optimal solution of the problem in (6) using the steps of the convergence analysis in [35] and adapting them properly to the current setting. First, from the residual norm-reduction property of the Newton method (see Sec. 10.3 in [35]),

$$\frac{d}{ds}\,|r(z_k + s\Delta z)|\,\Big|_{s=0} = -|r(z_k)| \le 0 \qquad (25)$$

so that the termination condition of the backtracking line search in Algorithm 1 is satisfied for a sufficiently small $s$,

$$|r(z_k + s\Delta z)| = (1-s)|r(z_k)| + o(s) \le (1-\alpha s)|r(z_k)| \qquad (26)$$

where $0 < \alpha < 1$, and hence

$$|r(z_{k+1})| \le |r(z_k)| \qquad (27)$$

so that $\{|r(z_k)|\}$ is a decreasing sequence that converges (since it is bounded from below by 0); from (25), a convergence point is 0 (otherwise, $|r(z_k)|$ could be further reduced as the inequality in (27) is strict if $|r(z_k)| \neq 0$), i.e. $z_k$ converges to a point that solves the KKT conditions. This point is globally-optimal since the KKT conditions are sufficient for global optimality of (P2) due to the convex-concave nature of $f_t$, as explained above.

It remains to show that (i) at each step of the Newton method, (22) can be solved to obtain the update $\Delta z$, and that (ii) $f(R(t),K(t))$ will approach the optimal value of (P1), attained at its optimal (saddle) point, arbitrarily closely as $t$ increases.

To establish the first point, it is sufficient to show that the Hessian $D_r$ is non-singular at each Newton step.

###### Proposition 1.

Consider the max-min problem in (17). Its Hessian $D_r$ as defined in (21) is non-singular for each $R$, $K$ in the domain of $f_t$.

###### Proof.

See Appendix. ∎

In fact, Proposition 1 ensures that the update equation (22) has a unique solution at each Newton step. To demonstrate the second point, we give below a sub-optimality bound for the barrier method, from which it follows that $f(R(t),K(t)) \to C$ as $t \to \infty$.

###### Proposition 2.

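For each $t > 0$, the gap of the barrier method used in (17) can be upper bounded as follows:

$$|f(R(t),K(t)) - C| \le \max(m_R, n_K)/t \qquad (28)$$

where $R(t)$, $K(t)$ are the optimal signal and noise covariance matrices returned by the barrier method for a given $t$, and $m_R$, $n_K$ are constants fixed by the inequality constraints on $R$ (including the IPCs) and on $K$ respectively, as detailed in the proof.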

###### Proof.

To establish the bound, consider first the min part of (P1) in (6) for a fixed $R = R(t)$ and use the analysis of the barrier method in Sec. 11.6 of [35] to obtain an upper bound:

$$\begin{aligned}
f(R(t),K(t)) &\le \min_{K \in S_K} f(R(t),K) + n_K/t \\
&\le \max_{R \in S_R}\min_{K \in S_K} f(R,K) + n_K/t
\end{aligned} \qquad (29)$$

where $n_K$ accounts for the constraint $K \ge 0$. Consider now the max part of (P1) for a fixed $K = K(t)$ and use the same approach to obtain

$$\begin{aligned}
f(R(t),K(t)) &\ge \max_{R \in S_R} f(R,K(t)) - m_R/t \\
&\ge \min_{K \in S_K}\max_{R \in S_R} f(R,K) - m_R/t
\end{aligned} \qquad (30)$$

where $m_R$ is the sum of three terms: $m$ accounts for the positive semi-definite constraint $R \ge 0$, while 1 and the number of IPCs account for the TPC and IPC respectively. Combining these two bounds, one obtains

$$C - m_R/t \le f(R(t),K(t)) \le C + n_K/t \qquad (31)$$

from which (28) follows, where we have used the saddle-point property of the problem (P1):

$$\max_{R \in S_R}\min_{K \in S_K} f(R,K) = \min_{K \in S_K}\max_{R \in S_R} f(R,K) = C \qquad (32)$$

which follows from von Neumann's minimax theorem [33]. ∎

Note that the original objective $f$ is used in (28), not the modified one $f_t$. The bound in (28) can be used in practice to set up $t_{\max}$ to meet a target accuracy $\Delta C$ in terms of the achieved secrecy rate $f(R(t),K(t))$: if $|f(R(t),K(t)) - C| \le \Delta C$ is needed, then setting

$$t_{\max} \ge \max(m_R, n_K)/\Delta C \qquad (33)$$

will satisfy this requirement.
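As a rough numerical illustration of (33), the sketch below computes the smallest admissible $t_{\max}$ and the resulting number of outer (barrier) iterations. It assumes, as in standard barrier methods, that $t$ is multiplied by a fixed factor at each outer iteration; the names `barrier_schedule`, `t0` and `mu` and their default values are hypothetical and not taken from the paper.

```python
import math

def barrier_schedule(m_R, n_K, delta_C, t0=1.0, mu=10.0):
    """Smallest t_max satisfying (33) and the number of outer iterations
    needed to reach it when t grows as t0, mu*t0, mu^2*t0, ...
    (t0 and mu are assumed parameters of the outer loop)."""
    t_max = max(m_R, n_K) / delta_C
    n_outer = max(0, math.ceil(math.log(t_max / t0) / math.log(mu)))
    return t_max, n_outer

# Usage: target accuracy of 1e-3 in the secrecy rate with m_R = n_K = 4
t_max, n_outer = barrier_schedule(m_R=4, n_K=4, delta_C=1e-3)
```

Because the required $t_{\max}$ grows only linearly in $1/\Delta C$ while $t$ grows geometrically, the number of outer iterations grows logarithmically with the target accuracy.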

## IV Optimal Covariance in the Singular Case

Algorithm 1 can be used to evaluate the secrecy capacity $C$ in the general case by evaluating numerically $f(R',K')$ at a saddle point $(R',K')$ (i.e. the optimal point of the max-min problem (P1) in (6)). However, $R'$ is not necessarily an optimizer of $C(R)$, i.e. not an optimal Tx covariance $R^*$, so that $R' \neq R^*$ is possible. This happens when the TPC is inactive and, for all active IPCs, the sum of the respective matrices $W_{3j}$ is singular (i.e. the intersection of their null spaces is non-trivial). The following example [33] illustrates this point. Let

$$H_1 = \mathrm{diag}\{1,0\}, \quad H_2 = \mathrm{diag}\{0,1\}, \quad W_3 = \mathrm{diag}\{1,0\} \qquad (34)$$

It is straightforward to see that the saddle point is

$$K' = I, \quad R' = \mathrm{diag}\{\min(P_T,P_I), a\} \qquad (35)$$

where $a$ is any value in the interval $[0, (P_T-P_I)_+]$, so that $R'$ is not unique if $P_T > P_I$. The optimal covariance is

$$R^* = \mathrm{diag}\{\min(P_T,P_I), 0\} \qquad (36)$$

Thus, $R' \neq R^*$ (unless $a = 0$) and

$$C = f(R',K') = \ln(1+\min(P_T,P_I)) \qquad (37)$$

for any $a \in [0, (P_T-P_I)_+]$. However, if one sets $a = (P_T-P_I)_+$, then

$$C(R') = \ln\frac{1+\min(P_T,P_I)}{1+(P_T-P_I)_+} < C \qquad (38)$$

where the inequality holds if $P_T > P_I$ (negative $C(R')$ is interpreted as zero rate). Hence, $R'$ is not an optimal transmit covariance (one maximizing the secrecy rate $C(R)$). We conclude that while Algorithm 1 can be applied to find the secrecy capacity via $f(R',K')$, it cannot be used to find $R^*$ in the singular case, since $R' \neq R^*$ is possible and, furthermore, $C(R') < C$ is also possible (as a side remark, we note that this effect disappears if the IPCs are removed, since the TPC is always active in this case).
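The example in (34)-(38) can be verified numerically. The sketch below assumes the standard Gaussian MIMO wiretap secrecy rate with unit-variance noise, $C(R) = \ln|I + H_1 R H_1^T| - \ln|I + H_2 R H_2^T|$ (clipped at zero), which is consistent with the values in (37) and (38); the function name `secrecy_rate` and the chosen powers $P_T = 3$, $P_I = 1$ are illustrative assumptions.

```python
import numpy as np

def secrecy_rate(R, H1, H2):
    """Assumed secrecy rate: ln|I + H1 R H1^T| - ln|I + H2 R H2^T|, clipped at 0."""
    c1 = np.linalg.slogdet(np.eye(H1.shape[0]) + H1 @ R @ H1.T)[1]
    c2 = np.linalg.slogdet(np.eye(H2.shape[0]) + H2 @ R @ H2.T)[1]
    return max(c1 - c2, 0.0)  # negative value is interpreted as zero rate

H1 = np.diag([1.0, 0.0])
H2 = np.diag([0.0, 1.0])
W3 = np.diag([1.0, 0.0])
P_T, P_I = 3.0, 1.0

R_star = np.diag([min(P_T, P_I), 0.0])                    # optimal covariance (36)
R_prime = np.diag([min(P_T, P_I), max(P_T - P_I, 0.0)])   # saddle point with a = (P_T - P_I)_+

rate_star = secrecy_rate(R_star, H1, H2)
rate_prime = secrecy_rate(R_prime, H1, H2)
```

Both covariances satisfy the TPC and the IPC, yet `R_prime` spends its extra power on the direction seen only by the eavesdropper, so its secrecy rate falls below that of `R_star`, which uses less transmit power.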

Therefore, the singular case needs special treatment to establish an optimal signaling strategy (optimal covariance), not just the capacity. This is done below via Algorithm 2, which incorporates Algorithm 1 and bisection search to find an optimal covariance as well as the least Tx power required to achieve the secrecy capacity (this may be smaller than the TPC power in the singular case).

To this end, let $C(P_T)$ be the secrecy capacity as a function of TPC power $P_T$, with all interference constraint powers being fixed, and let

$$P_0 = \min\{P : C(P_T) \le C(P)\ \ \forall P_T \ge 0\} \qquad (39)$$

so that $C(P_T) = C(P_0)$ for any $P_T \ge P_0$, i.e. $C(P_T)$ saturates at $P_T = P_0$ as $P_T$ increases; $P_0 = \infty$ corresponds to no saturation. It follows from the definition of $P_0$ that $\min\{P_T, P_0\}$ is the minimum Tx power required to achieve the capacity $C(P_T)$. Note that $\min\{P_T, P_0\} < P_T$ if $P_0 < P_T$, i.e. Tx power saving is possible and hence it is important to evaluate $P_0$ as well.

The following general properties of the function $C(P_T)$ are needed below to construct an algorithm and to prove its convergence. To the best of our knowledge, these properties of the secrecy capacity have not appeared in the literature before, even without interference constraints. We will assume below that $C(P_T) > 0$ for some $P_T$, i.e. $C(P_T)$ is not identically 0 for all $P_T$ (which would be the case for a reversely-degraded channel).

###### Proposition 3.

Let . The secrecy capacity as a function of TPC power (under fixed ) has the following properties:

1. $C(P_T)$ is a non-decreasing function of $P_T$; strictly-increasing for any $P_T < P_0$.

2. $C(P_T)$ is a concave, continuous function of $P_T$.

3. If for some , then this holds for any . Equivalently, if , then for any , where is the right derivative; additionally, for any .

4. If for some , then for any . Equivalently, if , then for any , , where is the left derivative; additionally, for any , i.e. is strictly increasing for any .

###### Proof.

See Appendix. ∎

Thus, $C(P_T)$ is concave, non-decreasing, and strictly-increasing for $P_T < P_0$. The rate of increase slows down with $P_T$. Note that for $P_T > P_0$, the capacity can be achieved with smaller Tx power $P_0 < P_T$. We will need below the following result to deal with the singular case.

###### Proposition 4.

Consider the Lagrange multiplier responsible for the TPC in (P1) as a function of TPC power $P_T$. Then, this multiplier is strictly positive for any $P_T < P_0$, i.e. the TPC is always active below $P_0$, and it is zero for any $P_T > P_0$.

###### Proof.

See Appendix. ∎

Based on this Proposition, we are now able to construct an iterative algorithm to evaluate an optimal covariance in the singular case numerically with any desired accuracy. The key idea for the case $P_0 < P_T$ (which is necessary for singularity) is to identify the saturation point $P_0$ and to apply Algorithm 1 with TPC power $P$ slightly less than $P_0$ (so that the TPC multiplier is positive and hence the TPC is active, thus avoiding the singularity in this way), which achieves a secrecy rate arbitrarily close to the capacity as $P$ approaches $P_0$ from below, and gives a covariance matrix achieving this secrecy rate as well.

Algorithm 2 returns a nearly-optimal covariance as well as its achieved secrecy rate and its distance to the secrecy capacity $C$. In addition, the algorithm returns an approximate value of $P_{T,\min}$, i.e. the minimum Tx power required to achieve $C$, as well as its accuracy. Note that both inaccuracies can be made as small as necessary by setting sufficiently small $\epsilon$ and $\delta$ (this follows from the continuity of all functions involved as well as the compactness of the feasible set for any finite $P_T$, in addition to the nature of the bisection).

The condition in Line 4 is set to account for numerical imprecision effects in computing $C(P)$. While in theory one can set $\epsilon = 0$, this can result in numerical instability in practice in some cases. Typical values of $\epsilon$ are $10^{-2}$ (1% accuracy) or smaller; $\delta$ controls the accuracy of the computed $P_{T,\min}$ and $\delta = 10^{-2}$ corresponds to 1% accuracy with respect to $P_T$.
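The bisection part of this procedure can be sketched as follows. This is an illustration under explicit assumptions rather than the paper's Algorithm 2: the test used here, $C(P) < (1-\epsilon)C(P_T)$, is an assumed form of the Line 4 condition, the callable `C` stands in for a run of Algorithm 1 at TPC power $P$, and the function name `find_saturation_power` is hypothetical. The toy capacity $\ln(1+\min(P, 1))$ is that of the example in (34)-(37) with $P_I = 1$.

```python
import math

def find_saturation_power(C, P_T, eps=1e-2, delta=1e-2):
    """Bisection for the minimum Tx power achieving (nearly) the capacity C(P_T).

    C: callable returning the secrecy capacity for a given TPC power
       (in practice, a run of Algorithm 1; here an assumed stand-in).
    Returns the final bracket [P_min, P_max] and the number of bisections.
    """
    C_T = C(P_T)
    P_min, P_max = 0.0, P_T
    steps = 0
    while P_max - P_min > delta * P_T:
        P = 0.5 * (P_min + P_max)
        if C(P) < (1.0 - eps) * C_T:   # assumed Line 4 test: still below saturation
            P_min = P
        else:                          # capacity already (nearly) reached at power P
            P_max = P
        steps += 1
    return P_min, P_max, steps

# Toy capacity that saturates at P_0 = 1
def C_example(P):
    return math.log(1.0 + min(P, 1.0))

P_lo, P_hi, n_steps = find_saturation_power(C_example, P_T=4.0, eps=0.0, delta=1e-2)
```

The bracket width halves at every step, so the number of bisections depends only on $\delta$; running Algorithm 1 at the lower end of the final bracket keeps the TPC power below the saturation point.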

### IV-A Analysis of Algorithm 2

Here we provide a convergence analysis of Algorithm 2 to justify the claims above. To simplify the discussion, we consider first the case of $\epsilon = 0$ and neglect the numerical imprecision effects (in particular, the imprecision of Algorithm 1, whose accuracy can be very high even for a small number of Newton steps), which is a standard assumption in the literature (see e.g. convergence analysis in [35]). The convergence for sufficiently small but non-zero $\epsilon$ will follow from the continuity of all functions involved.

Let $P_k$, $P_{\min,k}$ and $P_{\max,k}$ be the power values set in Lines 4 and 5 of Algorithm 2 at the $k$-th iteration of the bisection. Note that, due to the nature of the bisection, $\Delta_k = P_{\max,k} - P_{\min,k}$ is reduced by a factor of 2 at each step, so that

$$\Delta_k = P_T/2^k \qquad (40)$$

The following proposition gives further important properties.

###### Proposition 5.

The following holds at the $k$-th iteration of the bisection in Algorithm 2 with $\epsilon = 0$:

 Pmin,k

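and $P_{\min,k}$, $P_{\max,k}$ are monotonically increasing and decreasing sequences, respectively. If $P_0 < P_T$, then

$$P_{\min,k} < P_0 \le P_{\max,k} \qquad (42)$$

If $P_0 \ge P_T$, then

$$P_{\min,k} = P_T(1-2^{-k}), \quad P_{\max,k} = P_T \qquad (43)$$

###### Proof.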

To prove first two inequalities in (41), use and for any . The last inequality is by construction of the algorithm, i.e. from and , which follows from the fact that either or . Likewise, , since either or .

The first inequality in (42) follows from Line 4 (with $\epsilon = 0$), which implies that $P_{\min,k} = P_k$ iff $C(P_k) < C(P_T)$ so that, from the monotonically-increasing property of $C(P_T)$ in Proposition 3 and the initial condition $P_{\min,0} = 0$, $P_{\min,k} < P_0$. The second inequality in (42) is established in a similar way.

If $P_0 \ge P_T$, then $P_{\max,k} = P_T$, since $P_k < P_T$ implies $C(P_k) < C(P_T)$, from the monotonically-increasing property of $C(P_T)$ in Proposition 3, and the initial condition is $P_{\max,0} = P_T$. The first equality in (43) follows from the second and (40). ∎

Since $\Delta_k \to 0$ as $k \to \infty$, it follows from Proposition 5 that

$$P_{\min,k},\ P_{\max,k},\ P_k \to \min\{P_T, P_0\} = P_{T,\min} \qquad (44)$$

so that the minimum required power $P_{T,\min}$ can be evaluated with any desired accuracy. Furthermore, the inaccuracy does not exceed $\Delta_k$ and, since $\Delta_k = P_T/2^k$, the convergence is exponentially fast, so that very few steps are required in practice to achieve high accuracy. The number of steps needed to achieve the target accuracy $\delta P_T$ is, from $\Delta_k \le \delta P_T$,

$$k_\delta = \lceil \log_2(1/\delta) \rceil \qquad (45)$$

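Further note that Line 7 of Algorithm 2 evaluates the covariance under the TPC power $P_{\min,k_\delta}$, where $k_\delta$ is the total number of bisections, so that the TPC is active under this condition (since, from Proposition 4, the TPC is active if the TPC power is below $P_0$) and hence the singularity is avoided, i.e. the covariance returned by Algorithm 1 is a maximizer of $C(R)$ as well under the TPC power $P_{\min,k_\delta}$. From the continuity of $C(P_T)$, $C(P_{\min,k}) \to C(P_{T,\min})$ as $k \to \infty$, or equivalently, as $\delta \to 0$, i.e. arbitrarily high accuracy can be achieved in terms of the secrecy rate as well, with exponentially-fast convergence.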

The case of non-zero $\epsilon$ can be considered in a similar albeit more technical way. Let $C^{-1}(\cdot)$ be the inverse function of $C(P_T)$ and

$$P_{0\epsilon} = C^{-1}((1-\epsilon)C(P_0)) \qquad (46)$$

so that . The same steps as in the proof of Proposition 5 can be used to establish (41)-(42) with in place of . Eq. (43) applies as long as , after which (42) applies with in place of . Note that , , and , as , so that similar accuracy and performance is expected for sufficiently small but non-zero .

## V Dual problems

Motivated by energy efficiency issues (green communications, battery life etc.), one is led to consider the following problem, dual to (6), which is to globally minimize the total Tx power subject to the secrecy and interference power constraints:

$$(\mathrm{P3}) \quad \min_{(P,R)} P \quad \text{s.t.} \quad (P,R) \in S_3 \qquad (47)$$

where the feasible set $S_3$ is

$$S_3 = \{(P,R) : C(R) \ge C_0,\ R \ge 0,\ \mathrm{tr}(R) \le P,\ \mathrm{tr}(W_{3j}R) \le P_{Ij}\} \qquad (48)$$

and $C_0$ is the target secrecy rate and $C(R) \ge C_0$ is the secrecy constraint. Note that this problem is not convex in general, since $C(R)$ is not concave (unless the channel is degraded), and, hence, powerful tools of convex optimization cannot be used to solve it numerically (i.e. to find a global optimum).

In addition to this problem, since an optimal covariance of the max problem in (6) is not necessarily unique (see e.g. Example 2 in [34]), a new problem emerges: among all optimal covariances $R^*$, find one with the least trace (i.e. the minimum Tx power):

$$(\mathrm{P4}) \quad \min_{R \in S_1^*} \mathrm{tr}(R) \qquad (49)$$

where $S_1^*$ is the set of all optimal covariances of the max problem in (6). Since an explicit characterization of this set is not known in the general case (it is not even known whether this set is convex), standard optimization tools (including convex optimization) seem to be inapplicable, making this problem difficult for a direct attack.

The following Proposition shows that these problems have identical solutions and that Algorithm 2 solves both of them.

###### Proposition 6.

Let $P_3$ and $P_4$ be the optimal values of (P3) and (P4), and let $C_0$ be the optimal value of (P1), i.e. $C_0 = C(P_T)$. Then,

 P3=P4=min{PT,P0}, R∗3=R∗4, C(R∗3)=C(