Massive multiple-input multiple-output (MIMO), which deploys tens to hundreds of antennas at the base station (BS), is a key technology for significantly improving the spectrum and energy efficiency of 5G and beyond wireless communication systems. However, since the number of radio-frequency (RF) chains must scale with the number of antennas, the hardware complexity and power consumption would be unaffordably high for practical massive MIMO systems if high-resolution analog-to-digital converters (ADCs)/digital-to-analog converters (DACs) were employed. To deal with these issues, there has been growing interest in employing low-resolution ADCs/DACs, especially the cheapest one-bit ones. In particular, the one-bit DAC downlink has attracted considerable recent research interest.
Early works [2, 3, 4] are based on linear-quantized precoding schemes, in which the one-bit precoders are obtained by directly quantizing the classical linear precoders. Despite their low computational complexity, such linear-quantized precoders usually suffer from high symbol error floors. This has motivated a growing body of work on analyzing and designing nonlinear precoders for one-bit downlink transmission. In [5, 6, 7], the authors adopt the minimum mean square error (MMSE) criterion to formulate the one-bit precoding design problem, and the precoders proposed therein are shown to greatly improve on the performance of the linear-quantized precoders. Recently, the idea of constructive interference (CI) [14, 15] has been incorporated into one-bit precoding design. There are also some works that directly consider the symbol error probability (SEP) criterion [8, 9]. In fact, the CI metric is shown to be closely related to the SEP criterion and is easier to characterize, which motivates a new line of research focusing on the CI metric [11, 12, 13]. Specifically, the CI-based model for one-bit precoding was first formulated together with a precoder based on linear programming (LP) relaxation, named maximum safety margin (MSM). Later, an alternative CI-based model, known as the symbol-scaling model, was proposed; it admits a simpler formulation and has been shown to be equivalent to the previous model.
The existing state-of-the-art algorithms [12, 13] for the CI metric are mainly based on the LP relaxation of the symbol-scaling model. These algorithms generally consist of two stages: in the first stage, the LP relaxation model is solved; in the second stage, some technique is applied to determine the values of those elements of the LP solution that do not satisfy the one-bit constraint. Different techniques in the second stage lead to different algorithms. In particular, the partial branch-and-bound (P-BB) algorithm and the ordered partial sequential update (OPSU) algorithm apply a BB procedure and a greedy procedure in the second stage, respectively. The CI-based approaches generally enjoy significantly better performance than the MMSE-based approaches. However, they either suffer performance degradation in large-scale systems with high-order modulation (e.g., OPSU) or incur prohibitively high computational cost (e.g., P-BB).
In this paper, we focus on the CI-based symbol-scaling model for one-bit downlink transmission with phase shift keying (PSK) modulation. We propose an efficient negative $\ell_1$ penalty (NL1P) approach for solving the considered problem, which is especially efficient in the massive MIMO scenario where the problem dimension is large. More specifically, we first introduce a novel negative $\ell_1$ penalty model, which shares the same global and local solutions with the original problem when the penalty parameter is sufficiently large. This is in sharp contrast to the LP relaxation model on which the existing approaches (e.g., MSM, OPSU, P-BB) are based. Then, we transform the penalty model into an equivalent min-max problem. By exploiting its special structure, we propose an efficient alternating optimization (AO) algorithm for solving the reformulated min-max problem, where each iteration requires only two matrix-vector multiplications and one projection onto the simplex, making the algorithm particularly suitable for large-scale problems. We also establish the global convergence of the AO algorithm. Simulation results show that our proposed algorithm achieves a better tradeoff between the bit-error-rate (BER) performance and the computational efficiency than the state-of-the-art CI-based algorithms.
2 Problem Formulation
2.1 System Model
Consider a downlink multiuser massive MIMO system in which a BS equipped with $N$ antennas transmits signals to $K$ single-antenna users simultaneously. The received signal vector $\mathbf{y} \in \mathbb{C}^{K}$ is given by
$$\mathbf{y} = \mathbf{H}\mathbf{x}_T + \mathbf{n},$$
where $\mathbf{H} \in \mathbb{C}^{K \times N}$ is the flat-fading channel matrix between the BS and the users, $\mathbf{x}_T \in \mathbb{C}^{N}$ is the transmitted signal, and $\mathbf{n} \in \mathbb{C}^{K}$ is the additive white Gaussian noise.
We consider the scenario where one-bit DACs are employed at the BS. In this case, each element of $\mathbf{x}_T$ is drawn from a discrete set consisting of only four symbols. In particular, $\mathbf{x}_T \in \{(\pm 1 \pm j)/\sqrt{2N}\}^{N}$, where $j$ is the imaginary unit (satisfying $j^2 = -1$) and $\mathbf{x}_T$ is normalized to be of unit norm. In this paper, we restrict our attention to PSK modulation, that is, all elements of the intended data symbol vector $\mathbf{s} \in \mathbb{C}^{K}$ for the users are drawn from a unit-norm $M$-PSK constellation. Our goal here is to design the transmitted signal $\mathbf{x}_T$ such that the SEP is as low as possible.
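As a small illustration of the one-bit constraint described above, the following sketch quantizes an arbitrary complex vector (for example, the output of a linear precoder) entry by entry onto four symbols. It assumes the four symbols are $(\pm 1 \pm j)/\sqrt{2N}$ for a vector of length $N$, which is one way to satisfy the unit-norm normalization; the function name `one_bit_quantize` and the sample vector are hypothetical.

```python
import math


def one_bit_quantize(x):
    """Map each complex entry to the nearest of four one-bit DAC symbols.

    Assumption: the symbols are (+-1 +- 1j) / sqrt(2N), so the output
    has unit Euclidean norm for any input of length N.
    """
    n = len(x)
    scale = 1.0 / math.sqrt(2 * n)

    def sgn(t):
        # Ties at exactly zero are broken toward +1 (an arbitrary choice).
        return 1.0 if t >= 0 else -1.0

    return [scale * complex(sgn(z.real), sgn(z.imag)) for z in x]


# Hypothetical unquantized transmit vector with N = 4 antennas.
x_lin = [0.3 - 1.2j, -0.7 + 0.1j, 0.0 + 2.0j, -0.5 - 0.5j]
x_1bit = one_bit_quantize(x_lin)
norm = math.sqrt(sum(abs(z) ** 2 for z in x_1bit))
print(x_1bit, norm)
```

Only the signs of the real and imaginary parts survive the quantization, which is why directly quantized linear precoders lose so much information.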
2.2 Problem Formulation
We adopt the CI-based symbol-scaling model to formulate the problem of interest, as in [12, 13]. The main idea is to maximize the minimum distance from all received noise-free signals to their corresponding decision boundaries.
Taking -PSK modulation as an example and assuming the intended data symbol for user is , we illustrate in Fig. 1 how to characterize the distance from the noise-free received signal (corresponding to ) to its decision boundary. In particular, we decompose along and , which are the unit vectors parallel to the two decision boundaries of , as
where we remove the problem-dependent quantity from the constraint on and incorporate it into . Since and are both real numbers, we can express explicitly as a function of , , and by rewriting the complex-valued constraints (1a) into the real-valued form. Moreover, the original maximization problem can be converted into a minimization problem (by adding a negative sign in the objective). Then we arrive at the following compact form:
where and with
The constraint in problem (2) can be further substituted into the objective, which leads to the following form:
where and is the -th row of . In the following, we shall design algorithms based on the compact form (P), which appears to be easier to handle than the form (P).
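The decomposition of a noise-free received signal along the two decision boundaries can be sketched numerically as follows. This is an illustration under stated assumptions, not the paper's exact formulation: the decision region of a unit-norm $M$-PSK symbol with phase $\theta$ is taken to be the sector between the angles $\theta - \pi/M$ and $\theta + \pi/M$, and the names `decompose`, `alpha_a`, and `alpha_b` are hypothetical.

```python
import cmath
import math


def decompose(z, s, M):
    """Write z = alpha_a * s_a + alpha_b * s_b with real alpha_a, alpha_b.

    s_a and s_b are unit vectors along the two boundaries of the M-PSK
    decision sector of symbol s (assumed to be at angles theta -+ pi/M).
    """
    half = math.pi / M
    theta = cmath.phase(s)
    s_a = cmath.exp(1j * (theta - half))
    s_b = cmath.exp(1j * (theta + half))
    # Solve the 2x2 real system by Cramer's rule; det equals sin(2*pi/M).
    det = s_a.real * s_b.imag - s_b.real * s_a.imag
    alpha_a = (z.real * s_b.imag - s_b.real * z.imag) / det
    alpha_b = (s_a.real * z.imag - z.real * s_a.imag) / det
    return alpha_a, alpha_b


M = 8
s = cmath.exp(1j * math.pi / 4)          # intended 8-PSK symbol (example)
a_in, b_in = decompose(1.5 * s, s, M)    # signal well inside the sector
a_out, b_out = decompose(1j, s, M)       # signal outside the sector
print(min(a_in, b_in), min(a_out, b_out))
```

Both coefficients are positive exactly when the signal lies inside the decision sector, and the smaller one shrinks as the signal approaches a boundary, which is why a max-min criterion over these coefficients is a natural design objective.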
3 Proposed Negative $\ell_1$ Penalty Approach
Solving problems with a non-smooth objective and discrete constraints like (P) is generally challenging. In addition, the considered massive MIMO scenario leads to large-scale problems, which places a high demand on the efficiency of the algorithm. In this section, we propose an efficient negative $\ell_1$ penalty approach for finding a high-quality solution of problem (P).
3.1 Exact Penalty Model for Problem (P)
To deal with the discrete one-bit constraint in (P), we resort to the penalty technique. More specifically, we transform problem (P) into the following negative $\ell_1$ penalty model (see Footnote 1):
in which the discrete constraint is relaxed and a negative $\ell_1$-norm term is included in the objective to encourage entries of large magnitude. In the following theorem, we establish the equivalence between the penalty problem and the original problem (P).
Footnote 1: A closely related work first smooths the objective in (P) and then applies a negative square penalty to the smoothed problem. In contrast, our proposed penalty model deals with the original non-smooth objective, in which case the exact penalty property (see Theorem 1 below) does not hold for the negative square penalty; the non-smooth negative $\ell_1$ penalty is therefore adopted in this paper.
Theorem 1 (Exactness of Penalty Model (P))
If the penalty parameter in (P) satisfies , then the following results hold:
Any optimal solution of the penalty problem is also an optimal solution of the original problem (P), and vice versa.
Any local minimizer of the penalty problem is a feasible point of the original problem (P); conversely, any feasible point of (P) is also a local minimizer of the penalty problem.
The above theorem reveals that the penalty problem is an exact reformulation of the original problem (P), in the sense that the two problems share the same global and local solutions. This motivates us to solve the discrete problem (P) by solving the continuous penalty problem.
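The exactness property can be checked by brute force on a toy instance. The sketch below assumes that the original objective is a maximum of linear functions of a real vector with entries in $\{-1, 1\}$, and that the penalty model subtracts $\lambda$ times the $\ell_1$ norm over the box $[-1, 1]^n$; the matrix `A`, the value `lam = 2.0`, and the grid resolution are arbitrary illustrative choices, and the threshold on the penalty parameter stated in Theorem 1 is not reproduced here.

```python
from itertools import product

# Toy data (not from the paper): each row defines one linear function.
A = [[0.9, -0.2], [-0.4, 0.7], [0.1, 0.3]]


def f(x):
    """Assumed original objective: the largest of the linear functions."""
    return max(sum(a * xi for a, xi in zip(row, x)) for row in A)


def penalized(x, lam):
    """Assumed penalty objective: f(x) minus lam times the l1 norm of x."""
    return f(x) - lam * sum(abs(xi) for xi in x)


# Exhaustive search over the four one-bit points.
one_bit_min = min(product([-1, 1], repeat=2), key=f)

# Grid search over the relaxed box [-1, 1]^2 with a large penalty.
grid = [(i - 10) / 10 for i in range(21)]
lam = 2.0
box_min = min(product(grid, repeat=2), key=lambda x: penalized(x, lam))

print(one_bit_min, box_min)
```

In this instance the grid minimizer of the penalized objective lands on a one-bit point and coincides with the minimizer of the discrete problem, which is the behavior the theorem describes for a sufficiently large penalty parameter.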
3.2 Min-Max Reformulation of the Penalty Model
The penalty problem is still challenging to solve due to its non-smooth and non-convex objective. To tackle it, we introduce an auxiliary variable to reformulate the penalty problem as the following min-max problem:
It has been shown that the penalty problem and the min-max problem are equivalent. In particular, an optimal solution (stationary point) of one problem can be easily constructed from an optimal solution (stationary point) of the other. In the following, we design an efficient algorithm for the min-max problem by exploiting its special structure.
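One standard identity behind reformulations of this kind is worth spelling out. Assuming the non-smooth part of the objective is a maximum of $m$ linear functions $\mathbf{a}_l^{\mathsf T}\mathbf{x}$, with $\mathbf{a}_l^{\mathsf T}$ the $l$-th row of a matrix $\mathbf{A}$ (the symbols $\mathbf{y}$, $m$, and $\Delta$ below are notation introduced for this sketch), the maximum can be written as a linear maximization over the unit simplex:

```latex
\max_{1 \le l \le m} \mathbf{a}_l^{\mathsf T}\mathbf{x}
  \;=\; \max_{\mathbf{y} \in \Delta} \mathbf{y}^{\mathsf T}\mathbf{A}\mathbf{x},
\qquad
\Delta = \Bigl\{ \mathbf{y} \in \mathbb{R}^{m} :
  \mathbf{y} \ge \mathbf{0},\ \mathbf{1}^{\mathsf T}\mathbf{y} = 1 \Bigr\}.
```

The identity holds because a linear function over the simplex attains its maximum at a vertex, and each vertex selects one row of $\mathbf{A}$. This is consistent with the simplex projection that appears in the update of the auxiliary variable.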
3.3 Alternating Optimization Algorithm for the Min-Max Problem
Note that if the auxiliary variable in the min-max problem is fixed, then the objective is separable in the remaining variable. Based on this observation, we update the two variables in an alternating fashion.
Our proposed algorithm can be regarded as an extension of the algorithms proposed in  and , which are designed for smooth min-max problems and thus cannot be applied directly to the non-smooth min-max problem considered here. Similar to  and , we consider a perturbed function:
where the perturbed term is introduced to make strongly concave in . It is shown in  and  that the perturbed term is important for the convergence of the corresponding algorithms. At each iteration, our proposed algorithm performs the following updates:
where and are parameters that need to be selected carefully (their choices are specified in Theorem 2). Since the above algorithm updates the two variables alternately, we call it the alternating optimization (AO) algorithm.
The update of variable is a normal step which minimizes the current objective plus a regularization term. It is easy to check that the -subproblem (3a) admits a closed-form solution as
where , denotes the -th column of , and returns the sign of the corresponding real number. The update of the auxiliary variable is a projected gradient step for the perturbed function. The solution of subproblem (3b) involves only one matrix-vector multiplication and one projection onto the simplex, for which very fast implementations exist. Therefore, the proposed AO algorithm enables us to solve the min-max problem very efficiently. We summarize the AO algorithm for solving the min-max problem in Algorithm 1.
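The simplex projection used in the update of the auxiliary variable can be sketched with the classical sort-based method. This is a simple $O(m \log m)$ variant chosen for clarity and is not necessarily the fast implementation the text refers to; the helper `ascent_step`, the step size `rho`, and the sample vectors are hypothetical, and the actual update (3b) also involves a perturbation term that is not reproduced here.

```python
def project_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0:
            tau = t  # the last index passing this test fixes the shift
    return [max(vi - tau, 0.0) for vi in v]


def ascent_step(y, grad, rho):
    """Generic projected gradient ascent step on the simplex (a sketch)."""
    return project_simplex([yi + rho * gi for yi, gi in zip(y, grad)])


y = [1 / 3, 1 / 3, 1 / 3]
grad = [0.7, -0.1, 0.2]      # stands in for a matrix-vector product
y_next = ascent_step(y, grad, rho=0.5)
print(y_next, sum(y_next))
```

Each step costs one sort plus linear work, so for large systems the per-iteration cost is dominated by the matrix-vector products rather than by the projection.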
Next, we present the convergence result for the proposed AO algorithm (the proof is omitted due to space limitations). In particular, the following theorem shows that when the penalty parameter is sufficiently large and the algorithm parameters are properly selected, every limit point of the sequence generated by Algorithm 1 is a local minimizer of the penalty problem and, more importantly, satisfies the one-bit constraint. This desirable property results from combining the properties of the penalty model with those of Algorithm 1.
Theorem 2
Let be the sequence generated by Algorithm 1 with , and , where , , , , and . Then every limit point of is a stationary point of the min-max problem. Moreover, if is a local minimizer of the penalty problem and satisfies the one-bit constraint.
3.4 Negative $\ell_1$ Penalty Approach for Problem (P)
Theorems 1 and 2 inspire us to find a high-quality solution of problem (P) by applying Algorithm 1 to the min-max problem (which is equivalent to the penalty problem) with a sufficiently large penalty parameter. To further improve the numerical performance, we employ the homotopy technique, i.e., we initialize the penalty parameter with a small value, then gradually increase it and trace the solution path of the corresponding penalty problems, until the penalty parameter is sufficiently large and a one-bit solution is obtained. We name the whole procedure for solving problem (P) the negative $\ell_1$ penalty (NL1P) approach and summarize it in Algorithm 2.
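The homotopy procedure described above can be sketched as an outer loop around an inner solver. The sketch below is illustrative only: the inner solver is a toy cyclic grid search standing in for Algorithm 1, the objective form (a maximum of linear functions minus $\lambda$ times the $\ell_1$ norm) is an assumption, and `lam0`, `factor`, `lam_max`, and the matrix `A` are hypothetical values rather than the settings used in the paper.

```python
def objective(A, x, lam):
    """Assumed penalty objective: max of linear functions minus lam * l1."""
    f = max(sum(a * xi for a, xi in zip(row, x)) for row in A)
    return f - lam * sum(abs(xi) for xi in x)


def coordinate_search(A, x0, lam, sweeps=5, m=10):
    """Toy inner solver: cyclic grid search over each coordinate in [-1, 1].

    It starts from x0, so each stage is warm-started by the previous one.
    """
    grid = [(i - m) / m for i in range(2 * m + 1)]
    x = list(x0)
    for _ in range(sweeps):
        for i in range(len(x)):
            def value(t, i=i):
                return objective(A, x[:i] + [t] + x[i + 1:], lam)
            x[i] = min(grid, key=value)
    return x


def homotopy(A, x0, lam0=0.05, factor=2.0, lam_max=100.0):
    """Increase the penalty gradually until a one-bit point is reached."""
    x, lam = list(x0), lam0
    while lam <= lam_max:
        x = coordinate_search(A, x, lam)   # warm start from the last stage
        if all(abs(xi) == 1.0 for xi in x):
            break                          # one-bit solution found
        lam *= factor
    # If the loop ends because lam exceeded lam_max, x may not be one-bit.
    return x, lam


A = [[0.9, -0.2], [-0.4, 0.7], [0.1, 0.3]]   # toy data, not from the paper
x_final, lam_final = homotopy(A, [0.0, 0.0])
print(x_final, lam_final)
```

Starting with a small penalty lets the early stages focus on the original objective, while later stages push the entries toward the one-bit set; warm-starting each stage from the previous solution is what makes the sequence of problems cheaper than solving the final one from scratch.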
4 Simulation Results
In this section, we present simulation results to show both the effectiveness and the efficiency of our proposed NL1P approach. We consider multiuser massive MIMO systems where the BS is equipped with hundreds of antennas. The transmission block length is set to be and the SNR is defined as , where unit transmit power is assumed. The channel matrix is composed of independent and identically distributed Gaussian random variables with zero mean and unit variance. All the results are obtained with Monte Carlo simulations over 1000 independent channel realizations.
The parameters used in our algorithms are as follows. In Algorithm 2, the initial point is chosen as ; the penalty parameter is initialized as and increased by a factor of at each iteration. In Algorithm 1, we set the initial point of as , where is the all-one vector, and the other parameters as and . We terminate Algorithm 1 for solving the subproblem () in Algorithm 2 when its iteration number is more than or when the distance of its successive iterates is less than .
We compare the proposed NL1P approach with the following algorithms: zero-forcing (ZF) with infinite-resolution DACs, termed as ‘Inf-Bit ZF’, which serves as the performance limit of all one-bit precoders; ZF followed by one-bit quantization , termed as ‘1-Bit ZF’; SQUID  which is an algorithm based on the MMSE metric, termed as ‘MMSE 1-Bit SQUID’; the MSM precoder  based on the CI metric obtained by quantizing the LP relaxation solution, termed as ‘CI 1-Bit MSM’; OPSU and P-BB  based on the CI metric, termed as ‘CI 1-Bit OPSU’ and ‘CI 1-Bit P-BB’, respectively.
In Figs. 2 and 3, we present the BER results for different massive MIMO systems. Specifically, in Fig. 2 we consider a system with -PSK modulation and in Fig. 3 we consider a system with -PSK modulation. The P-BB approach is not included in Fig. 2 due to its prohibitively high complexity. As shown in the figures, the one-bit ZF precoder suffers from a severe BER floor due to its coarse one-bit quantization, while all of the nonlinear approaches exhibit significantly better BER performance. Nevertheless, the MMSE-based SQUID approach and the CI-based MSM approach also saturate early in the high-SNR regime. Of the precoders that offer satisfactory performance, the proposed approach exhibits better error-rate performance than the state-of-the-art OPSU precoder. In particular, we can observe an SNR gain up to nearly dB and dB in Fig. 2 and Fig. 3 respectively when the BER is ; as the BER becomes lower, the performance gain in terms of the SNR also becomes larger. The P-BB algorithm, though slightly better in performance than our proposed algorithm, is much less computationally efficient, as will be demonstrated in Fig. 4.
In Fig. 4, we evaluate the efficiency of the compared algorithms by reporting their CPU time. Among all the compared CI-based precoders, our proposed approach is the most efficient. More specifically, the computational costs of the MSM precoder and the OPSU precoder increase rapidly with the scale of the system, while that of our proposed approach grows much more slowly. This is because both the MSM and OPSU algorithms solve the LP relaxation model via the interior-point method, whose complexity is high when the problem dimension is large, while the proposed NL1P approach solves the penalty model with the AO algorithm, which enjoys low per-iteration complexity. As shown in the figure, the P-BB algorithm is much more computationally expensive than all the other methods. Its computational cost becomes prohibitively high when the number of users is large, since the complexity of the branch-and-bound procedure grows exponentially with the number of users. This makes the P-BB approach unsuitable for practical implementation, so it can only serve as a performance benchmark.
From the simulation results, we can conclude that our proposed NL1P approach achieves a better tradeoff between the BER performance and the computational efficiency than the state-of-the-art CI-based algorithms. The good BER performance is mainly attributed to the exactness of the negative $\ell_1$ penalty model, and the high computational efficiency is due to the AO algorithm used to solve the penalty model in the proposed NL1P approach.
References
[1] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[2] A. K. Saxena, I. Fijalkow, and A. L. Swindlehurst, “Analysis of one-bit quantized precoding for the multiuser massive MIMO downlink,” IEEE Trans. Signal Process., vol. 65, no. 17, pp. 4624–4634, Sept. 2017.
[3] A. Mezghani, R. Ghiat, and J. A. Nossek, “Transmit processing with low resolution D/A-converters,” in Proc. 16th IEEE Int. Conf. Electron., Circuits Syst., Dec. 2009, pp. 683–686.
[4] O. B. Usman, H. Jedda, A. Mezghani, and J. A. Nossek, “MMSE precoder for massive MIMO using 1-bit quantization,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Mar. 2016, pp. 3381–3385.
[5] S. Jacobsson, G. Durisi, M. Coldrey, T. Goldstein, and C. Studer, “Quantized precoding for massive MU-MIMO,” IEEE Trans. Commun., vol. 65, no. 11, pp. 4670–4684, Nov. 2017.
[6] O. Castañeda, T. Goldstein, and C. Studer, “POKEMON: A non-linear beamforming algorithm for 1-bit massive MIMO,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Mar. 2017, pp. 3464–3468.
[7] O. Castañeda, S. Jacobsson, G. Durisi, M. Coldrey, T. Goldstein, and C. Studer, “1-bit massive MU-MIMO precoding in VLSI,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 7, no. 4, pp. 508–522, Dec. 2017.
[8] M. Shao, Q. Li, W.-K. Ma, and A. M.-C. So, “A framework for one-bit and constant-envelope precoding over multiuser massive MISO channels,” IEEE Trans. Signal Process., vol. 67, no. 20, pp. 5309–5324, Oct. 2019.
[9] F. Sohrabi, Y.-F. Liu, and W. Yu, “One-bit precoding and constellation range design for massive MIMO with QAM signaling,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 3, pp. 557–570, Jun. 2018.
[10] M. Shao, Q. Li, Y. Liu, and W.-K. Ma, “Multiuser one-bit massive MIMO precoding under MPSK signaling,” in Proc. IEEE Global Conf. Signal Inf. Process., Nov. 2018, pp. 833–837.
[11] H. Jedda, A. Mezghani, J. A. Nossek, and A. L. Swindlehurst, “Massive MIMO downlink 1-bit precoding with linear programming for PSK signaling,” in Proc. IEEE Workshop Signal Process. Adv. Wireless Commun., Jul. 2017, pp. 1–5.
[12] A. Li, C. Masouros, F. Liu, and A. L. Swindlehurst, “Massive MIMO 1-bit DAC transmission: A low-complexity symbol scaling approach,” IEEE Trans. Wireless Commun., vol. 17, no. 11, pp. 7559–7575, Nov. 2018.
[13] A. Li, F. Liu, C. Masouros, Y. Li, and B. Vucetic, “Interference exploitation 1-bit massive MIMO precoding: A partial branch-and-bound solution with near-optimal performance,” IEEE Trans. Wireless Commun., vol. 19, no. 5, pp. 3474–3489, May 2020.
[14] C. Masouros and G. Zheng, “Exploiting known interference as green signal power for downlink beamforming optimization,” IEEE Trans. Signal Process., vol. 63, no. 14, pp. 3628–3640, Jul. 2015.
[15] A. Li, D. Spano, J. Krivochiza, S. Domouchtsidis, C. G. Tsinos, C. Masouros, S. Chatzinotas, Y. Li, B. Vucetic, and B. Ottersten, “A tutorial on interference exploitation via symbol-level precoding: Overview, state-of-the-art and future directions,” IEEE Commun. Surveys Tuts., vol. 22, no. 2, pp. 796–839, 2nd Quart. 2020.
[16] A. Li, C. Masouros, B. Vucetic, Y. Li, and A. L. Swindlehurst, “Interference exploitation precoding for multi-level modulations: Closed-form solutions,” IEEE Trans. Commun., vol. 69, no. 1, pp. 291–308, Jan. 2021.
[17] S. Lu, I. Tsaknakis, M. Hong, and Y. Chen, “Hybrid block successive approximation for one-sided non-convex min-max problems: Algorithms and applications,” IEEE Trans. Signal Process., vol. 68, pp. 3676–3691, Apr. 2020.
[18] Z. Xu, H. Zhang, Y. Xu, and G. Lan, “A unified single-loop alternating gradient projection algorithm for nonconvex-concave and convex-nonconcave minimax problems,” 2020. [Online]. Available: https://arxiv.org/abs/2006.02032
[19] L. Condat, “Fast projection onto the simplex and the ball,” Math. Program., vol. 158, no. 1, pp. 575–585, 2016.
[20] M. Shao and W.-K. Ma, “Binary MIMO detection via homotopy optimization and its deep adaptation,” IEEE Trans. Signal Process., vol. 69, pp. 781–796, Feb. 2021.