I Introduction
For a given precoding support and penalty function , the Generalized Least Square Error (GLSE) precoder constructs the transmit vector from the data vector and the channel matrix as where is a power control factor and [1]
(1) 
The generality of the support and the penalty function allows various forms of constraints on the transmit vector to be addressed. Compared to the classical approaches for imposing such constraints, the studies in [2, 3, 1, 4] have shown significant enhancements obtained via the GLSE precoding scheme. Nevertheless, the computational complexity of this scheme has remained the main challenge and is addressed in this paper.
The main motivation of this study comes from the great deal of interest recently received by massive Multiple-Input Multiple-Output (MIMO) systems [5]. From an implementation point of view, however, these systems face the problem of high Radio Frequency (RF) cost, which arises from the vast number of RF chains needed in such setups. The initial approach to overcome this issue is to restrict the Peak-to-Average Power Ratio (PAPR) of the transmit vector [6, 7]. In this case, nonlinear power amplifiers with lower dynamic ranges can be employed, and the total RF cost can be significantly reduced. Another approach is Transmit Antenna Selection (TAS) [8, 9], in which only a subset of transmit antennas is kept active in each transmission interval, and therefore the number of required RF chains is reduced. Although such approaches combat the issue of high RF cost, the conventional algorithms significantly degrade the performance. GLSE precoders reduce this degradation by finding the optimal transmit vector which satisfies the constraints imposed by these approaches.
In general, GLSE precoders solve an optimization problem in each transmission interval. This task is not trivial for choices of the penalty function and precoding support which are nonconvex. For cases with convex optimization problems, the precoder can be implemented via generic linear programming algorithms. The high computational complexity of these algorithms for large dimensions, however, leaves the implementation of GLSE precoders an open issue in massive MIMO setups.
Generalized Approximate Message Passing (GAMP) [10] is a low-complexity iterative approach for several estimation problems, based on approximating the loopy belief propagation algorithm in the large-system limit [11]. The algorithm is known to considerably outperform other available iterative approaches. The underlying estimation problems, which are addressed by GAMP, are mathematically similar to the GLSE precoding scheme, and therefore the algorithm can be employed to design a class of iterative precoders based on the GLSE scheme.
The main contribution of this paper is to adapt and tune the GAMP algorithm to address the GLSE precoding scheme recently proposed in [2, 3, 1, 4]. The developed iterative scheme is referred to as “GLSE-GAMP” precoding and exhibits low complexity. Using the fact that the GLSE and GLSE-GAMP precoders address the same optimization problem, we further propose a tuning strategy based on the asymptotic results in [2, 3, 1, 4] derived via the replica method. Our numerical investigations show that the performance of GLSE-GAMP precoders tuned by the proposed strategy closely matches the asymptotics of the corresponding GLSE precoders.
Notation
Throughout the paper, scalars, vectors and matrices are represented with non-bold, bold lower case and bold upper case letters, respectively. is a identity matrix, and is the Hermitian of . The sets of real and integer numbers are denoted by and , and represents the complex plane. For , , and identify the real part, imaginary part and augmented vector, respectively, and the expression indicates that is the augmented version of . For , the gradient operator is defined as . and denote the Euclidean and norm, respectively. Considering the random variable , represents either the probability mass or density function. Moreover, identifies the expectation. For the sake of compactness, is abbreviated by , and we define and for a given nonnegative real .
II Problem Formulation
Consider a Gaussian broadcast MIMO setup in which a sequence of data symbols for is transmitted to single-antenna users simultaneously. The transmitter is equipped with transmit antennas. The channel is considered to be quasi-static fading and perfectly known at the transmitter. By employing the GLSE precoding scheme given in (1) with some penalty and precoding support , the transmit vector is constructed as where and is a nonnegative power control factor. For this setup, we assume that the following constraints hold.

has independent and identically distributed (i.i.d.) zero-mean complex Gaussian entries with unit variance.

decouples meaning that .

and grow large, such that the load factor is kept fixed in both and .

in which is a unitary matrix, and is a diagonal matrix with asymptotic eigenvalue distribution . For , we define the Stieltjes transform as with the expectation being taken over and the R-transform as where denotes the inverse with respect to composition.
By proper choices of the support and penalty , the GLSE precoder can impose several constraints on the transmit vector.

Let and ; then, the number of active transmit antennas is constrained.
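The two quantities that these constraints target can be measured directly on any candidate transmit vector. The following minimal sketch computes the fraction of active antennas and the PAPR of a complex vector; the function names and the tolerance `tol` are illustrative assumptions and do not appear in the paper.

```python
def active_fraction(x, tol=1e-12):
    """Fraction of entries of x with magnitude above tol (active antennas)."""
    return sum(1 for v in x if abs(v) > tol) / len(x)


def papr(x):
    """Peak-to-average power ratio: max |x_n|^2 divided by mean |x_n|^2."""
    powers = [v.real ** 2 + v.imag ** 2 for v in x]
    return max(powers) / (sum(powers) / len(powers))


# Toy transmit vector with two of four antennas switched off.
x = [1 + 1j, 0j, -1 - 1j, 0j]
print(active_fraction(x))
print(papr(x))
```

A TAS-type constraint bounds the first quantity, while a peak power constraint on the support bounds the second.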
III GLSE-GAMP Precoders
The GLSE scheme can be considered as a max-sum problem which can be addressed via the GAMP algorithm [10].
III-A GAMP Algorithm
The GAMP algorithm, proposed in [10], intends to estimate from iteratively considering the following setup.

Each entry of is generated from the corresponding entry of some via .

The entries of are obtained from the entries of the vector through identical scalar channels with .
Depending on the estimation scheme, the GAMP algorithm is developed to address the “max-sum” or “sum-product” problems. The max-sum GAMP algorithm iteratively determines the Maximum-A-Posteriori (MAP) estimate
(2) 
for some scalar functions and which represent the conditional distributions and . The sum-product GAMP algorithm, moreover, addresses the Minimum Mean-Square-Error (MMSE) estimation where .
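For orientation, the max-sum problem is commonly written in the GAMP literature in the following form. The symbols $f_{\mathrm{in}}$, $f_{\mathrm{out}}$, $\mathbf{A}$, $\boldsymbol{z}$ and the index ranges are assumed notation for this sketch and may differ from the notation used in (2).

```latex
\hat{\boldsymbol{x}}
  = \arg\max_{\boldsymbol{x}} \;
    \sum_{n=1}^{N} f_{\mathrm{in}}(x_n)
    + \sum_{k=1}^{K} f_{\mathrm{out}}(z_k, y_k),
\qquad \boldsymbol{z} = \mathbf{A}\boldsymbol{x}
```

When $f_{\mathrm{in}}$ and $f_{\mathrm{out}}$ are the logarithms of a prior and a likelihood, the maximizer is the MAP estimate.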
III-B The GLSE-GAMP Algorithm
By comparing GLSE precoding with (2), it is observed that the precoding scheme solves a max-sum problem in which with being the channel matrix, for , and and . As a result, the GAMP algorithm can be applied to iteratively construct the transmit vector . After some lines of derivation, the max-sum GAMP algorithm can be adapted to the GLSE scheme in (1). The resulting algorithm is referred to as the “GLSE-GAMP” algorithm and is presented in Algorithm 1 for the precoding support and the complex-valued matrix . The variables and functions in the algorithm, for and , are defined as follows.

The real two-dimensional vectors , , , , and are the augmented forms of the complex scalars , , , , and , respectively.

The matrices , , and are real matrices, and is defined as
(3) with representing the entry of .

is the output thresholding function defined as
(4) where the function is determined by
(5) 
is the input thresholding function being defined as
(6) where the function is evaluated by
(7) 
The initial conditions are and .
The update rules in Algorithm 1 are derived by extending the max-sum GAMP algorithm to the case with a complex-valued matrix and an arbitrary input support . The extension follows by determining the update rules for the corresponding loopy belief propagation algorithm and then taking some steps similar to [10, Appendix C]. The detailed derivations are omitted due to the page limit and are presented in the extended version of the manuscript.
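To make the role of the two thresholding functions concrete, the sketch below implements the real-valued scalar max-sum functions in the form used in [10]: the output function for a quadratic output term, and a brute-force input function for an arbitrary penalty on a finite candidate grid. This is not the complex-valued augmented form of (4)-(7); the quadratic choice `-(z - y)**2`, the grid, and all names are assumptions made for illustration.

```python
def g_out_quadratic(p, y, tau_p):
    """Max-sum output function for f_out(z, y) = -(z - y)**2 (real-valued).

    z_hat maximizes f_out(z, y) - (z - p)**2 / (2 * tau_p),
    and the output function is (z_hat - p) / tau_p.
    """
    z_hat = (2.0 * tau_p * y + p) / (2.0 * tau_p + 1.0)
    return (z_hat - p) / tau_p


def g_in_numeric(r, tau_r, penalty, grid):
    """Max-sum input function found by exhaustive search over a grid.

    Returns the grid point maximizing -penalty(x) - (x - r)**2 / (2 * tau_r).
    Restricting the grid is a crude way to model a constrained support.
    """
    return max(grid, key=lambda x: -penalty(x) - (x - r) ** 2 / (2.0 * tau_r))


# Example: an l1-type penalty yields soft thresholding of r.
grid = [i / 1000.0 for i in range(-3000, 3001)]
x_hat = g_in_numeric(r=1.5, tau_r=0.5, penalty=lambda x: 0.4 * abs(x), grid=grid)
print(x_hat, g_out_quadratic(p=0.3, y=1.0, tau_p=0.5))
```

The exhaustive search is only meant to show what the input function computes; in practice a closed form is used whenever the penalty and support admit one.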
Remark 1:
One should distinguish between the GLSE scheme and the GLSE-GAMP algorithm. The former is a least-squares-based scheme to design transmit signals which fulfill some desired constraints. The GLSE-GAMP algorithm, on the other hand, is an iterative approach based on GAMP to address the GLSE scheme. For some choices of the penalty function, precoding support and channel matrix, the GLSE-GAMP algorithm converges to the transmit signal given by the GLSE scheme. There are, however, some particular cases in which the GLSE-GAMP algorithm does not converge. For these cases, Algorithm 1 does not give the desired transmit signal, and the algorithm needs to be modified to avoid divergence. This issue is briefly discussed in Section V.
In contrast to GLSE precoders, GLSE-GAMP precoders exhibit low complexity. Considering Algorithm 1 and noting that the matrices in (8a)-(9d) are fixed, it is straightforward to show that the total worst-case complexity of GLSE-GAMP precoders per iteration is . The number of iterations, moreover, does not grow with the dimensions. Therefore, one can conclude that the overall complexity of the precoding scheme is as well.
III-C Tuning GLSE-GAMP Precoders
In order to impose a given set of constraints on the transmit signal, the corresponding GLSE-GAMP precoder should be tuned. As an example, consider the case in which the number of active transmit antennas, as well as the average transmit power, is to be restricted via a GLSE-GAMP precoder. In this case, one may set and . The factors and in this case control the average transmit power and the fraction of active antennas, respectively. Consequently, for given constraints, these factors need to be tuned. Nevertheless, the derivation of an exact tuning strategy is not a trivial problem, as the constrained parameters, i.e., the average power or the fraction of active antennas, cannot be derived in terms of the tuning factors straightforwardly. We therefore propose a tuning strategy based on the asymptotics of the GLSE-GAMP algorithm and its connection to the GLSE scheme. The large-system performance of GLSE-GAMP precoders is studied through asymptotic analyses of “state evolution” equations; see [12] and the references therein. Following the results in the literature, e.g. [13, 14], it is shown that for choices of , and , in which the GLSE-GAMP algorithm converges, the asymptotic performance of the algorithm coincides with the large-system performance of GLSE precoders investigated in [1, 4]. This result indicates that in the large-system limit, the tuning factors for GLSE-GAMP and GLSE precoders are the same. Therefore, for a given set of constraints, we derive the tuning factors of the GLSE-GAMP precoders by tuning the corresponding GLSE precoders.
Tuning Strategy:
Assume that the constraints are to be satisfied via a GLSE-GAMP precoder with penalty and support which are controlled by for . Here, are decoupling functions meaning that . To tune accordingly, we define
(10) 
where with
(11) 
and for and which satisfy and
(12) 
The precoder is then accordingly tuned by choosing for such that the equations are satisfied.
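Numerically, tuning of this kind amounts to solving a scalar equation of the form "asymptotic constrained parameter equals target" for the tuning factor. The sketch below illustrates this with bisection on a toy stand-in: it assumes that an entry stays active when the magnitude of a circularly symmetric complex Gaussian variable with variance `sigma2` exceeds a threshold. This model and all names are illustrative assumptions and do not reproduce (10)-(12).

```python
import math


def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if abs(f_mid) < tol or hi - lo < tol:
            return mid
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


def active_fraction(threshold, sigma2=1.0):
    """P(|g| > threshold) for g ~ CN(0, sigma2): exp(-threshold**2 / sigma2)."""
    return math.exp(-threshold ** 2 / sigma2)


# Tune the threshold so that half of the entries stay active.
eta_target = 0.5
threshold = bisect_root(lambda t: active_fraction(t) - eta_target, 0.0, 10.0)
print(threshold)
```

In the actual strategy, the constraint function depends on quantities that are themselves obtained from fixed-point equations, so the root search would wrap a fixed-point solver rather than a closed-form expression.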
Derivation:
The proposed tuning strategy evaluates the decoupled GLSE precoder (see Proposition 2 in [1] for the decoupling property of GLSE precoders; a more general version of the property is presented in [4, Section II-A]) by finding and from the fixed-point equations. The asymptotic constrained parameters are then determined by taking the expectation and setting it equal to . One should note that the strategy is in general heuristic, since it tunes the precoders for the large-system limit. Nevertheless, the numerical investigations show that for several cases, the GLSE-GAMP precoders are well tuned via this strategy.
IV Applications of GLSE-GAMP Precoders
In this section, we investigate two special cases of GLSE-GAMP precoders with TAS and limited PAPR. Throughout the analyses, we assume that represents an i.i.d. Rayleigh fading channel with variance , i.e., .
IV-A GLSE-GAMP Precoder with TAS
As discussed, TAS can be directly addressed at the transmit side by using the GLSE scheme with . The corresponding GLSE-GAMP precoder is therefore given by Algorithm 1 where , and , respectively with
(13a)  
(13b) 
and . For the input thresholding function, the analytic evaluation of the function from the augmented form in (6) is not trivial. We thus employ the complex scalar form of the equation which results in and where
(14) 
with and , and
(15a)  
(15b) 
By setting , the GLSE scheme reduces to Regularized Zero Forcing (RZF) precoding, and thus, the GLSE-GAMP algorithm iteratively constructs the output of the RZF precoder.
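The complex scalar form of an input function of this type can be illustrated with a generic shrinkage rule. The sketch below assumes a scalar cost made of a quadratic term with weight `a`, a magnitude term with weight `b`, and a quadratic coupling to `r` with weight `c`; these weights and names are hypothetical and are not the quantities in (14)-(15).

```python
def shrink(r, a, b, c):
    """Minimize a*|x|^2 + b*|x| + c*|x - r|^2 over complex x (a, b >= 0, c > 0).

    The minimizer keeps the phase of r and has magnitude
    max(c*|r| - b/2, 0) / (a + c).
    """
    mag = abs(r)
    if mag == 0.0:
        return 0j
    new_mag = max(c * mag - 0.5 * b, 0.0) / (a + c)
    return new_mag * (r / mag)


def cost(x, r, a, b, c):
    """Scalar cost minimized by shrink()."""
    return a * abs(x) ** 2 + b * abs(x) + c * abs(x - r) ** 2


print(shrink(3 + 4j, a=1.0, b=2.0, c=1.0))    # shrunk, phase preserved
print(shrink(0.1 + 0j, a=1.0, b=2.0, c=1.0))  # small inputs are set to zero
```

Small inputs are mapped exactly to zero, which is how a magnitude penalty switches antennas off. With `b = 0` the rule becomes a pure linear scaling of `r`, in line with the reduction to a linear precoder noted above.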
Tuning Strategy
We employ the strategy in Section III-C to tune and such that the fraction of active antennas and the average transmit power are and , respectively. For this case, and and . Consequently, and are determined from the fixed-point equations for and
(16) 
and is determined in terms of and through
(17) 
IV-B GLSE-GAMP Precoder with PAPR Constraint
The precoder in Section IV-A can further take the PAPR constraint into account by setting . The support in this case imposes a peak power constraint on the transmit signal which, along with the penalty function, restricts both the PAPR and the number of active antennas (see [1, Section IV-B] for further illustrations). Considering Algorithm 1, the output function for this setup remains unchanged, and the input function reads
(18) 
with the corresponding gradient
(19) 
where , , and
(20) 
, and are moreover given as in Section IV-A. By setting , the precoder employs all the transmit antennas and restricts only the PAPR. In this case, , , and reduces to zero.
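The effect of a peak power constraint on a scalar input function can be sketched as a clipped shrinkage. The example assumes a scalar cost with hypothetical weights `a`, `b`, `c` and a peak power limit `peak_power`. Because this cost is convex in the magnitude once the phase is aligned with `r`, the constrained minimizer is the unconstrained magnitude clipped at the peak amplitude. These names and the cost are illustrative and are not the expressions in (18)-(20).

```python
def shrink_peak_limited(r, a, b, c, peak_power):
    """Minimize a*|x|^2 + b*|x| + c*|x - r|^2 subject to |x|^2 <= peak_power."""
    mag = abs(r)
    if mag == 0.0:
        return 0j
    unconstrained = max(c * mag - 0.5 * b, 0.0) / (a + c)
    new_mag = min(unconstrained, peak_power ** 0.5)  # clip at the peak amplitude
    return new_mag * (r / mag)


print(shrink_peak_limited(3 + 4j, 1.0, 2.0, 1.0, peak_power=1.0))    # clipped
print(shrink_peak_limited(3 + 4j, 1.0, 2.0, 1.0, peak_power=100.0))  # not clipped
```

Setting `b = 0` in this sketch leaves only the scaling and the clipping, which corresponds to the case where all antennas remain active and only the peak power is limited.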
Tuning Strategy
V Numerical Investigations
To investigate the performance of GLSE-GAMP precoders, we define the distortion measure for a given as
(24) 
which determines the average distortion caused by the multiuser interference at receive terminals. It is moreover shown that the achievable ergodic rate per user can be bounded from below in terms of as proved in [2].
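For a single channel realization, a distortion of this kind can be evaluated empirically as the squared error between the noise-free received vector and the scaled data vector. The sketch below assumes normalization by the number of users and omits the averaging over realizations; both choices, as well as the names `H`, `x`, `s` and `rho`, are assumptions and may differ from the normalization used in (24).

```python
def distortion(H, x, s, rho):
    """Per-user squared error ||H x - sqrt(rho) s||^2 / K for one realization."""
    K = len(H)
    total = 0.0
    for h_row, s_k in zip(H, s):
        received = sum(h * v for h, v in zip(h_row, x))  # k-th entry of H x
        total += abs(received - rho ** 0.5 * s_k) ** 2
    return total / K


# Toy example with K = 2 users and N = 2 antennas.
H = [[1 + 0j, 0j], [0j, 1 + 0j]]
s = [1 + 0j, 1j]
print(distortion(H, [1 + 0j, 1j], s, rho=1.0))  # interference-free case
print(distortion(H, [0j, 0j], s, rho=1.0))      # nothing transmitted
```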
The circles in Fig. 1 show the distortion given by the GLSE-GAMP precoder presented in Section IV-A for various inverse load factors considering several constraints on the number of active antennas. The results are given for antennas and iterations. The asymptotic performances of the corresponding GLSE precoders, derived via the replica method in [4], are also shown with solid lines. Here, and is set such that . As the figure shows, the GLSE-GAMP precoder accurately tracks the performance of the GLSE scheme, even for a practically moderate number of antennas. For the PAPR-limited precoder in Section IV-B, the distortion at is plotted in terms of in Fig. 2. The curves are shown for multiple PAPR constraints. Similar to Fig. 1, solid lines correspond to the GLSE scheme and circles denote the simulation results for the GLSE-GAMP precoder with and for PAPR dB. Here, we have considered , and is tuned via the proposed strategy assuming all the antennas being active. The figure shows that by increasing the PAPR up to dB, the performance of the precoder is sufficiently close to the case without PAPR restriction. This observation suggests employing the GLSE-GAMP precoder in order to reduce the transmit PAPR without any significant performance loss. In this case, power amplifiers with lower dynamic ranges can be utilized, which can significantly reduce the RF cost.
Remark 2:
It is known that the GAMP algorithm converges for i.i.d. Gaussian matrices [13, 14]. However, by deviating from this assumption, the algorithm may diverge. This issue was recently addressed in [15] via the Vector Approximate Message Passing (VAMP) algorithm. Consequently, for channel models with ill-conditioned matrices, one can develop a precoding algorithm based on the GLSE scheme by taking the same approach while employing VAMP.
Vi Conclusion
This paper has proposed a class of low-complexity precoders based on the GLSE scheme using the GAMP algorithm. The numerical investigations have been consistent with the replica results for the GLSE scheme given in [2, 3, 1, 4]. This consistency demonstrates that various implementation limitations in massive MIMO systems can be effectively overcome using low-complexity, yet effective, algorithms. As indicated in Remark 2, the GLSE-GAMP precoders may fail to converge for channel models with ill-conditioned channel matrices, and therefore an alternative algorithm can be proposed via VAMP. The extension under VAMP is, however, left as possible future work.
References
 [1] A. Bereyhi, M. A. Sedaghat, S. Asaad, and R. Müller, “Nonlinear precoders for massive MIMO systems with general constraints,” International ITG Workshop on Smart Antennas (WSA), 2017.
 [2] M. A. Sedaghat, A. Bereyhi, and R. Müller, “Least Square Error Precoders for Massive MIMO with Signal Constraints: Fundamental Limits,” IEEE Transactions on Wireless Communications, 2017.
 [3] M. A. Sedaghat, A. Bereyhi, and R. Müller, “A New Class of Nonlinear Precoders for Hardware Efficient Massive MIMO Systems,” International Conference on Communications (ICC), 2017.
 [4] A. Bereyhi, M. A. Sedaghat, and R. Müller, “Asymptotics of nonlinear LSE precoders with applications to transmit antenna selection,” IEEE International Symposium on Information Theory (ISIT), 2017.
 [5] J. Hoydis, S. Ten Brink, and M. Debbah, “Massive MIMO in the UL/DL of cellular networks: How many antennas do we need?” IEEE Journal on Selected Areas in Communications, vol. 31, no. 2, pp. 160–171, 2013.
 [6] S. K. Mohammed and E. G. Larsson, “Per-antenna constant envelope precoding for large multiuser MIMO systems,” IEEE Transactions on Communications, vol. 61, no. 3, pp. 1059–1071, 2013.
 [7] J.-C. Chen, “Low-PAPR precoding design for massive multiuser MIMO systems via Riemannian manifold optimization,” IEEE Communications Letters, vol. 21, no. 4, pp. 945–948, 2017.
 [8] H. Li, L. Song, and M. Debbah, “Energy efficiency of large-scale multiple antenna systems with transmit antenna selection,” IEEE Transactions on Communications, vol. 62, no. 2, pp. 638–647, 2014.
 [9] S. Asaad, A. Bereyhi, R. R. Müller, and A. M. Rabiei, “Asymptotics of transmit antenna selection: Impact of multiple receive antennas,” International Conference on Communications (ICC), 2017.
 [10] S. Rangan, “Generalized approximate message passing for estimation with random linear mixing,” IEEE Int. Sym. on Inf. Theory (ISIT), 2011.
 [11] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proceedings of the National Academy of Sciences, vol. 106, no. 45, pp. 18914–18919, 2009.
 [12] A. Javanmard and A. Montanari, “State evolution for general approximate message passing algorithms, with applications to spatial coupling,” Information and Inference, p. iat004, 2013.
 [13] S. Rangan, P. Schniter, and A. Fletcher, “On the convergence of approximate message passing with arbitrary matrices,” IEEE International Symposium on Information Theory (ISIT), 2014.
 [14] S. Rangan, P. Schniter, E. Riegler, A. K. Fletcher, and V. Cevher, “Fixed points of generalized approximate message passing with arbitrary matrices,” IEEE Transactions on Information Theory, vol. 62, no. 12, pp. 7464–7474, 2016.
 [15] S. Rangan, P. Schniter, and A. K. Fletcher, “Vector approximate message passing,” IEEE International Symp. on Inf. Theory (ISIT), 2017.