The preliminary studies on multiple-input multiple-output (MIMO) capacity showed that single-user MIMO capacity can be achieved by Gaussian input signaling [1, 2]. In , Weingarten et al. further proved that the entire capacity region of a Gaussian MIMO broadcast channel (BC) is achievable through optimal dirty paper coding (DPC) method. In reality, the optimal precoding method is not only high-complexity but also difficult to implement. As a result, there are growing interests in suboptimal techniques such as zero-forcing (ZF) and successive zero-forcing dirty-paper coding (SZFDPC) [4, 5, 6, 7, 8, 9].
Traditionally, the majority of the research on MIMO capacity assumes a sum power constraint (SPC) for which efficient algorithms can be derived [1, 2, 10, 11, 12]. Nevertheless, solutions to SPC problems may result in nonlinear distortions if the allocated power is beyond the power limit of one or several power amplifiers. Towards this end, per-antenna power constraint (PAPC) is more realistic and of particular interest[13, 14, 15, 16] .
To the best of authors’ knowledge, only Tran et al. characterized the achievable rate region of SZFDPC under PAPC by customized interior-point method [9, 8]. However, this second-order-based method is not attractive to large-scale MIMO systems due to high complexity. In this study, we propose two algorithms to obtain sum rates of MIMO systems under PAPC and SZFDPC. The first algorithm achieves the optimal solution by alternating optimization (AO) while the second one exploits machine learning (ML) to arrive at a suboptimal one. In particular, our contributions include the following:
A novel AO-based algorithm is proposed to obtain the optimal solution. Specifically, the original maximization in the BC is transformed into an equivalent minimax problem in the multiple access channel (MAC), then an efficient iterative algorithm is derived based on successive convex approximation (SCA).
In case the optimal approach is feasible but inefficient, an ML-based approach, which trades off the complexity and the optimal solution, is a good alternative. In fact, our ML approach relies on linear regression and thus is appealing to applications such as massive MIMO.
We report for the first time the comparison of SZFDPC and others precoding methods under PAPC. Moreover, our ML-based approach can be applicable to similar capacity-related problems.
The remainder of the paper is organized as follows. A description of the system model is in Section II. The methods of computing the sum rate for SZFDPC are described in Section III followed by the numerical results in Section IV. Finally, we conclude the paper in Section V.
: Standard notations are used in this paper. Bold lower and upper case letters represent vectors and matrices, respectively.
defines an identity matrix, of which the size can be easily inferred from the context;denotes the space of complex matrices; and are Hermitian and normal transpose of , respectively; is the th entry of ; is the determinant of ; stands for a basis of the null space of ; denotes the absolute value of ; , where is a square matrix, returns the vector of diagonal elements of . Furthermore, we denote the Euclidean norm by and .
Ii System Model
Consider a MIMO BC consisting of a base station (BS) and users. The BS and each user are equipped with and antennas respectively. The channel matrix for user is denoted by . Normally, a user suffers interference from all other users in the system. For user in the SZFDPC scheme, the interference caused by users is cancelled by DPC, while that caused by users is nulled out by zero-forcing technique. In this way, a MIMO BC can be decomposed into parallel interference-free channels. We refer the interested reader to  and references therein for a more detailed description of the SZFDPC scheme.
The sum rate of SZFDPC can be characterized through solving the sum rate (SRMax) problem under PAPC which is formulated as
where is the input covariance matrix for user . The constraint in (1b) is imposed to suppress the interference from users as mentioned above.
Due to the use of zero-forcing method, SZFDPC is a suboptimal transmission strategy compared to DPC. However, SZFDPC does not cancel multiuser inference only by zero-forcing technique since DPC is still invoked for this purpose. Thus, SZFDPC can achieve a performance close to that of DPC, which was reported in various previous studies [17, 8, 9]. We note that for SZFDPC (i.e. (1)) to be feasible, it should hold that which is assumed in this paper. This dimension condition basically imposes a constraint on the maximum number of users that can be supported simultaneously. When the number of demanding users increases, a user scheduling algorithm is required and this problem was studied in 
where several efficient user selection methods were proposed for SZFDPC. We also remark that the interference cancelling process is performed sequentially after each user, and thus user ordering in SZFDPC is important. Optimal user ordering requires solving a combinatorial optimization problem but efficient user order algorithms were also proposed in. In this paper we simply assume the natural user ordering for SZFDPC and focus on the precoder design.
Iii Proposed Algorithms
Iii-a Alternating Optimization
Inspired by the work in , we extend the AO approach to our considered problem. More specifically, by extending Theorem 2 of , we can show that (2) can be equivalently transformed into the following minimax problem in the dual MAC
are obtained from the singular value decomposition of. In light of AO algorithm in [7, 15, 16], we first fix and consider the following problem
Problem (5) is the one of finding the capacity of parallel interference-free MIMO channels under a sum power constraint, which can be solved efficiently by the classical water-filling algorithm.
We now consider the problem of finding for given . To this end, we apply the following inequality:
where , and stands for . In the th iteration of the proposed algorithm to solve (3), is the solution to the following problem
Since is diagonal, (7) indeed reduces to
It’s worth noting that the feasible set of (8) is in fact a simplex. As shown shortly, projection onto a simplex can be done efficiently by closed-form expressions and this motivates us to solve (8) by a gradient projection (GP) method, which is outlined in Algorithm 1.
In Algorithm 1 the subscript denotes the iteration index and is the gradient of the objective at iteration computed as in line 1. Projection of onto the feasible set of (8) is equivalent to solving the following problem
This optimization problem can be solved efficiently by a water-filling-like algorithm. Specifically, the partial Lagrangian function of (9) is written as
For a given , it is easy to see that the optimal solution to is given by . The optimal such that can be simply found by bisection method. Note that when the PAPC is the same for all antennas, i.e., , for all , the feasible set becomes a canonical simplex for which more efficient algorithms for projection are available . The overall algorithm to solve the SRMax problem with SZFDPC and PAPC is summarized in Algorithm 2. The convergence of Algorithm 2 can be proved similarly to [7, Appendix B] and thus is skipped for the sake of brevity.
Iii-B A Feature Design-based Approach
Following similar arguments in , the interior-point-based approach to solve the considered problem has the complexity up to while the total per-iteration complexity of Algorithm 2 is flops. On the one hand, Algorithm 2 dominates the existing approach and reduces the complexity significantly, but on the other hand, it still experiences high complexity in case of massive MIMO settings where . In such cases, we can employ the following ML approach to obtain a suboptimal solution since this approach can adapt quickly to any changes in the systems while retaining the satisfactory performance.
Assuming that we execute Algorithm 2 to generate optimal sum rates based on inputs where is the number of samples. Note that contain features of the power constraints and channel coefficients. If we simply apply arbitrary ML algorithms, the errors will be extremely prohibitive due to the fact that the considered problem is nonlinear in nature with respect to either power constraint or the channel matrix (c.f. Fig. 3). On the other hand, nonlinear ML algorithms are much more difficult to investigate since there are no available solutions to this type of optimization. Even the optimal solution mentioned above already contains many nonlinear terms. Here, we propose a novel two-step preprocessing method to transform the inputs into another feature space so that linear regression algorithms are applicable. Herein, we will refer to this approach as feature design (FD) based approach. Step 1: Select a set of features by customizing the principle component analysis (PCA)-based algorithm in  largest contribution Step 2: Transform into higher feature space by .
Note that instead of choosing a number of largest eigenvalues of the covariance matrix randomly, we choose eigenvalues which are larger than 1. As a result, we can form a new matrix corresponding to those eigenvalues. To select the most dominant features, we calculate the contribution measure
where . Then we select the desired features with respect to largest contribution . Again, we avoid random selection of whose appropriate value is not easy to justify in practice. Instead, we propose to choose based on the matrix size and the number of users:
where and are the number of transmit antennas and the number of users, respectively; where is the number of receive antennas. Note that from (12) and we can therefore obtain a new matrix with reduced dimension .
In fact there are no criteria to choose a function to transform the inputs into another space where efficient algorithms can be derived. In our approach, we rely on the characteristics of the problem to propose a transform function. Specifically, recall that the considered sum rate is a function, thus we can transform these features into new space features where linear model are possible using the following
where is the base of the logarithm. Under this assumption, an output is given by
Iv Numerical Results
In this section, we numerically evaluate the performance of the proposed algorithms. For all iterative algorithms of comparison, we set an error tolerance of as the stopping criterion. The initial value in the corresponding proposed algorithm is set to the identity matrix for all simulations and the total transmit power is simply set to dBW, if not mentioned otherwise. The number of receive antennas is fixed to and and the power limit for all antennas is equal to . Other simulation parameters are specified for each setup. The codes are executed on a 64-bit desktop that supports 8 Gbyte RAM and Intel CORE i7.
In the first experiment, we compare the average sum rate of different precoding methods i.e., SZFDPC, ZF, and DPC with PAPC over a large number of channel realizations. Under large-scale MIMO setup (), three methods obtain the identical value since the correlation between channels approaches to zero. However, under normal MIMO settings, there is still a big gap between the capacity for ZF and that of SZFDPC and DPC. In general, SZFDPC always achieves a near-capacity rate.
Since Algorithm 2 and other benchmark scheme of consideration all generate the optimal solution to the corresponding problem, we mainly compare their complexity. In particular, we compare the runtime of Algorithm 2 with the interior-point method proposed in . Fig. 2 plots the average runtime as a function of the number of transmit antennas for finding the maximum sum rate of SZFDPC. We observe that Algorithm 2 performs water-filling and GP to find , which results in lower computation time compared with  which uses the barrier interior-point method. In general, the barrier method and other second-order optimization methods are known to achieve a superlinear rate but its per-iteration cost increases quickly with the problem size. This is actually consistent with what is shown in Fig. 2 for the barrier-method .
As can be seen from Fig. 2, our AO-based algorithm, though has low complexity, may take a few seconds or more to execute when we increase the problem size. Thus, a method of lower complexity which can strike a balance between the sum rate and complexity is of interest. In the following, we will investigate the performance of our ML-based approach to such scenarios. The PAPC ratio is chosen randomly, whereas SNR is chosen from the set dBW. For each MIMO setting we generated 240 samples. Also, we simply use natural log to transform the feature space in (13).
In Fig. 3
, we compare the optimal and estimate sum rates of a MIMO system with linear and nonlinear regression methods. More specifically, we utilize support vector regression (SVR) with radial basis function (RBF) kernel for nonlinear regression. Here, we train on 216 samples and test on 24 samples. As can be seen from the figure, without proper preprocessing methods both OLS and SVR fail to fit the data due to nonlinear nature of the problem. However, the results of the simple OLS, which take feature design into account, are very close to optimal solutions. The performance has also proved the feasibility of our approach.
In the last experiment, we consider the effectiveness of our approach in terms of average relative root mean square error (aRRMSE)  over large samples with varying number of transmit and receive antennas. In particular, we obtain the aRRMSE by executing 10-fold cross-validation using three simple linear ML algorithms: OLS, Ridge and PCR. According to , a learning model is considered good and excellent when and , respectively. Interestingly, the ML-based method shows a sufficiently low error rate, especially when . From our observations, the training matrices are invertible and the eigenvalues are larger than 1, thus the performance of OLS and PCR is the same and has minor difference in comparison with that of ridge regression. Unsurprisingly, this observation coincides with the properties of these regression methods.
We have proposed two low-complexity methods to compute sum rates of MIMO systems under PAPC and SZFDPC. The experiments using the optimal approach have stated that the SZFDPC can obtain near-capacity rates whereas the ZF scheme still operates far from the optimal capacity boundary for a specified number of users. The suboptimal ML-based method is more advantageous in case of large-scale MIMO settings. Extensive numerical results have demonstrated the superiority of the proposed algorithms over the existing interior-point method. More importantly, our ML-based approach can be applicable to a class of similar problems.
-  E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans. Telecommun, vol. 10, pp. 585–598, Nov. 1999.
-  G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Pers.Commun, vol. 6, pp. 311–335, Mar. 1998.
-  H. Weingarten, Y. Steinberg, and S. S. Shamai, “The capacity region of the Gaussian multiple-input multiple-output broadcast channel,” IEEE Trans. Inf. Theory, vol. 52, no. 9, pp. 3936 – 3964, Sep. 2006.
-  G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inf. Theory, vol. 49, no. 7, pp. 1691–1706, Jul. 2003.
-  A. Dabbagh and D. Love, “Precoding for multiple antenna Gaussian broadcast channels with successive zero-forcing,” IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3837–3850, Jul. 2007.
-  Q. Spencer, A. Swindlehurst, and M. Haardt, “Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels,” IEEE Trans. Signal Process., vol. 52, no. 2, pp. 461–471, Feb. 2004.
-  T. M. Pham, and R. Farrell, and J. Dooley, and E. Dutkiewicz, and D. N. Nguyen, and L.-N. Tran, “Efficient zero-forcing precoder design for weighted sum-rate maximization with per-antenna power constraint,” IEEE Trans. Veh. Technol., vol. 67, no. 4, pp. 3640–3645, Apr. 2018.
-  L.-N. Tran, M. Juntti, M. Bengtsson, and B. Ottersten, “Weighted sum rate maximization for MIMO broadcast channels using dirty paper coding and zero-forcing methods,” IEEE Trans. Commun., vol. 61, no. 6, pp. 2362–2373, Jun. 2013.
-  ——, “Beamformer designs for MISO broadcast channels with zero-forcing dirty paper coding,” IEEE Trans. Wireless Commun., vol. 12, no. 3, pp. 1173–1185, Mar. 2013.
-  W. Yu, W. Rhee, S. Boyd, and J. Cioffi, “Iterative water-filling for Gaussian vector multiple-access channels,” IEEE Trans. Inf. Theory, vol. 50, no. 1, pp. 145–152, Jan. 2004.
-  N. Jindal, W. Rhee, S. Vishwanath, S. Jafar, and A. Goldsmith, “Sum power iterative water-filling for multi-antenna Gaussian broadcast channels,” IEEE Trans. Inf. Theory, vol. 51, no. 4, pp. 1570–1580, Apr. 2005.
-  J. Liu, Y. T. Hou, and H. D. Sherali, “On the maximum weighted sum-rate of MIMO Gaussian broadcast channels,” in Proc. IEEE ICC, May 2008, pp. 3664 – 3668.
-  M. Vu, “MIMO capacity with per-antenna power constraint,” in Proc. IEEE GLOBECOM, Dec. 2011, pp. 1 – 5.
-  W. Yu and T. Lan, “Transmitter optimization for the multi-antenna downlink with per-antenna power constraints,” IEEE Trans. Signal Process., vol. 55, no. 6, pp. 2646–2660, Jun. 2007.
-  T. M. Pham, and R. Farrell, and L.-N. Tran, “Low-complexity approaches for MIMO capacity with per-antenna power constraint,” in Proc. IEEE VTC-Spring, Jun. 2017, pp. 1–7.
-  ——, “Alternating optimization for capacity region of Gaussian MIMO broadcast channels with per-antenna power constraint,” in Proc. IEEE VTC-Spring, Jun. 2017, pp. 1–6.
-  L.-N. Tran and E.-K. Hong, “Multiuser diversity for successive zero-forcing dirty paper coding: Greedy scheduling algorithms and asymptotic performance analysis,” IEEE Trans. Signal Process., vol. 58, no. 6, pp. 3411–3416, Jun. 2010.
-  S. Vishwanath, N. Jindal, and A. Goldsmith, “Duality, achievable rates and sum-rate capacity of Gaussian MIMO broadcast channels,” IEEE Trans. Inf. Theory, vol. 49, pp. 2658–2668, Oct. 2003.
-  L. Armijo, “Minimization of functions having Lipschitz continuous first partial derivatives.” Pacific J. Math., vol. 16, no. 1, pp. 1 – 3, 1966.
-  L. Condat, “Fast projection onto the simplex and the ball,” Mathematical Programming, Series A, vol. 158, no. 1, pp. 575 – 585, Jul. 2016.
-  S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective, 1st ed. Orlando, FL, USA: Academic Press, Inc., 2015.
-  A. J. Smola and B. Schölkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199–222, Aug. 2004.
-  M.-F. Li, X.-P. Tang, W. Wu, and H.-B. Liu, “General models for estimating daily global solar radiation for different solar radiation zones in mainland china,” Energy Conversion and Management, vol. 70, pp. 139 – 148, 2013.