I Introduction
Massive multipleinput multipleoutput (MIMO) systems rely on a large number of base station (BS) antennas for simultaneously serving a few dozens of users, while striking an attractive spectral efficiency (SE) vs. energy efficiency (EE) tradeoff [1, 2, 3]. However, a large number of antennas inevitably lead to an excessive radiofrequency (RF) hardware cost and energy consumption. Hence, a hardwareefficient hybrid analogdigital (HAD) beamforming architecture has been proposed as an alternative to the fully digital beamformer for practical implementation. Explicitly, this architecture relies on an analog beamformer in the RF domain combined with a lowdimensional digital beamformer in the baseband, hence allowing a significant reduction in the number of RF chains required. This in turn provides increased design flexibility for striking an attractive performance vs. complexity tradeoff. The receiver’s power dissipation is dominated by that of the analogtodigital converters (ADCs), since it increases exponentially with the number of quantization bits [4],[5]. Hence, lowresolution ADCs (LADCs) have been advocated for the RF chains of HAD receivers.
Ia Related Work
Extensive research efforts have been invested in the design and performance evaluation of hybrid beamformers [7, 8, 9, 10, 11, 12, 13, 14]. Basically, the existing hybrid beamforming architectures can be mainly divided into the static partiallyconnected structure [8, 9, 10], the static fullyconnected structure [8, 9, 11], and the dynamic partiallyconnected or fullyconnected structure [12, 13, 14]. The authors of [8] considered a singleuser millimeter wave (mmWave) MIMO system and treated the hybrid beamforming weight design as a matrix factorization problem. Efficient alternating optimization algorithms were developed for both static partiallyconnected and static fullyconnected structures. As a further exploration, the authors of [9]
proposed heuristic algorithms for the design of static partiallyconnected and fullyconnected HAD beamformers, permitting to maximize the overall SE of a broadband orthogonal frequencydivision multiplexing (OFDM)based system. Besides, the proposed algorithm in
[9] for the fullyconnected structure can achieve SE close to that of the optimal fullydigital solution with much less number of RF chains. Moreover, to dynamically adapt to the spatial channel covariance matrix and improve the system performance, [12, 13, 14] proposed to design the dynamicconnected HAD beamforming architecture to realize a flexible analog beamforming matrix. In [12], a dynamic subarray approach was considered for OFDM systems, and a greedy algorithm to optimize the array partition based on the longterm channel characteristics was suggested. Different from the partiallyadaptiveconnected structure in [12], the authors in [13] proposed to implement the hybrid precoder with a fullyadaptiveconnected structure. A joint optimization of switchcontrolled connections and the hybrid precoders was formulated as a largescale mixedinteger nonconvex problem with high dimensional power constraints. By modifying the onoff states of switchcontrolled connections, this fullyadaptiveconnected structure can realize a fullyconnected structure or any possible subconnected structure. Furthermore, the HAD beamforming strategy has also been investigated in the context of novel relayaided systems [15, 16] and in Terahertz communications [17]. Since employing LADCs in the massive MIMO regime has become indispensable for reducing the power consumption and hardware cost, it has catalyzed substantial interest in the recent literature. The authors of [18]analyzed the performance for transmission over flatfading MIMO channel using singlebit ADC and derived the capacity upperbound both at infinite and finite signaltonoise ratios (SNR). The impact of the spatial correlation of antennas on the rate loss caused by the coarse quantization of LADCs was further studied in
[19], where the authors concluded that LADCs can achieve a sum rate performance much closer to the case of ideal ADCs under spatially correlated channels.On account of the benefits provided by HAD beamformers and LADCs, a number of studies have been proposed to characterize the performance of massive MIMO systems relying on the HAD beamforming architecture using LADCs. The pioneering contribution of [20] proposed a generalized hybrid architecture using LADCs and verified that the achievable rate is comparable to that obtained by highprecision ADC based receivers at low and medium SNRs, which provides valuable insights for future research. Intensive research efforts have also been dedicated to analog/digital beamforming design [21, 22], to SE/EE optimization [23, 24, 25], to channel estimation [26] and to signal detection [27].
However, previous research on HAD beamforming using LADCs has mainly considered uniform quantizers having a fixed, predetermined number of bits, which limited the performance of these systems due to coarse quantization. As a further advance, it was shown that a variableresolution ADC or adaptiveresolution ADC (RADC) architecture is preferable [28, 29]. In [28], the authors investigated a mixedADC structure designed for cloud radio access networks (CRAN). In particular, they developed an ADCresolution selection algorithm for maximizing either the SE or EE based on an approximation of the generalized mutual information in the lowSNR regime. In [29], the authors developed a pair of ADC bit allocation strategies for minimizing the quantization error effects under a total ADC power constraint, thereby achieving an improved performance. However, the separate design of the ADC quantization bit allocation and hybrid beamforming matrices tends to suffer from performance degradation. Hence, the authors of [30] jointly optimized both the on/off modes of the RF processing chains and the number of ADC quantization bits.
As a further development, the authors of [31] aimed for jointly optimizing the sampling resolution of ADCs and the hybrid beamforming matrices, which results in energy efficient solutions for pointtopoint mmWave MIMO systems. In light of [31], the authors of [32] extended the joint design to multiuser systems, hence achieving a significantly improved EE compared to the existing schemes. The potential advantages of RADCs in the context of various practical systems have also been reported in [33, 34, 35]. The authors of [33] focused their attention on the uplink of mmWave systems using RADCs and investigated the associated joint resource allocation and user scheduling problem. In [35] the design of the reconfigurable intelligent surface (RIS) aided mmWave uplink system relying on RADCs was investigated, demonstrating that an RIS is capable of mitigating the performance erosion imposed by RADCs.
Nevertheless, the aforementioned studies are mainly based on perfect instantaneous channel state information (CSI), which is assumed to be known at the BS. In practice, the acquisition of perfect CSI cannot be achieved in massive MIMO systems due to the inevitable channel estimation errors [36]. Therefore, how to efficiently design the pilot signals for improving the precision of channel estimation is of paramount importance. In practice a codebookbased pilot scheme is preferred, where orthogonal pilot sequences are chosen from a given codebook as a benefit of its lowcomplexity implementation and low feedback overhead [37, 38]. However, allocating mutually orthogonal pilot sequences to a large number of users for avoiding interference during channel estimation would require excessive pilot lengths and their orthogonality would still be destroyed upon convolution with the dispersive channel impulse response (CIR). For this reason, the carefully constructed reuse of a limited set of orthogonal pilot sequences for different users for example is of paramount importance for highprecision channel estimation. Further design alternatives were proposed for massive MIMO systems for example in [38, 39], which dispense with a codebook, hence they may be termed as codebookfree solutions. However, they tend to require a higher feedback overhead for attaining a high channel estimation accuracy.
IB Main Contributions
Despite the above advances, there is a paucity of research contributions on jointly optimizing the pilot sequences, the number of ADC quantization bits, and the HAD combiner for achieving highprecision channel estimation in the uplink of a multiuser massive MIMO system employing RADCs. Hence our inspiration is to fill this knowledgegap. In particular, both codebookbased and codebookfree pilot schemes are investigated, where for each scheme, we aim to minimize the mean square error (MSE) of the channel estimate subject to a transmit power constraint, to the constantmodulus constraint imposed on the elements of the analog combining matrix, and to the additional constraints on the number of quantization bits. In a nutshell, the main contributions of this paper over the existing literature lie in the following:

We focus on jointly optimizing the pilot sequences, the HAD combiners and the RADC bit allocation in the presence of a correlated Rayleigh fading channel model. The channel estimation mean square error (MSE) minimization problem is formulated, which only requires the knowledge of channel statistics under practical operating conditions.

We first transform the highly nonconvex optimization problem into an equivalent but more tractable form by introducing auxiliary variables and employing fractional programming (FP) techniques. Then, we develop a new block coordinate descent (BCD) based algorithm for a codebookfree channel estimation scheme and a penalty dual decomposition (PDD) based algorithm for a codebookbased channel estimation scheme. Both of these iterative algorithms ensure convergence to the set of stationary solutions of the original optimization problem. Furthermore, the computational complexity of the proposed algorithms is analyzed.

A simplified lowcomplexity algorithm is also presented for the codebookbased channel estimation scheme, which solves the MSE minimization problem suboptimally but efficiently.

To characterize the benefits of our proposed algorithms, we provide exhaustive simulation results in terms of the MSE, sum rate and feedback overhead for a range of pertinent system settings. We demonstrate that through the coordinated allocation of bits to the RADCs, the proposed algorithms can beneficially exploit the knowledge of channel statistics to accomplish the pilot design, while minimizing interference and improving the channel estimation accuracy.
IC Organization and Notation
This paper is organized as follows. In Section II and III, we introduce the investigated system model and formulate the optimization problem for the constrained channel estimation, respectively. In Section IV and V, we propose efficient algorithms by solving the formulated problems for the codebookfree and codebookbased pilot schemes, respectively. Section VI provides simulation results to appraise the performance of the proposed algorithms. The paper is concluded in Section VII, whilst proofs and detailed derivations appear in the Appendices.
Notations: For a matrix , , , , and
denote its transpose, conjugate, conjugate transpose and vectorization, respectively.
denotes the element at the intersection of row and column . For a square matrix , , , and represents its trace, inverse and Frobenius norm, respectively. denotes a diagonal matrix consisting of the diagonal elements of .denotes an identity matrix. For a vector
, denotes a diagonal matrix with along its main diagonal and denotes the Euclidean norm of vector . The symbol denotes the Kronecker product. and respectively denote the real and magnitude parts of a complex number. denotes the largest integer less than or equal to and denotes the smallest integer greater than or equal to . We let () denote complex (real) space. denotes the expectation anddenotes the circularly symmetric complex Gaussian distribution with mean 0 and variance
.Ii System Model
As shown in Fig. 1, we consider a multiuser massive MIMO uplink system that adopts a static fullyconnected hybrid AD combining structure with RADCs at the BS. The BS which is equipped with antennas and RF chains, serves singleantenna users simultaneously. The baseband output of each RF chain is fed to a dedicated RADC that employs variable bit resolution to quantize the real and imaginary parts of each analog signal. Moreover, we assume that the BS and users are fully timesynchronized.
Iia Channel Model
Without loss of generality, we consider a narrowband correlated channel model. Let represent the uplink channel from user to the BS. Then, the channel vector can be expressed as
(1) 
where is a vector with independent and identically distributed (i.i.d.) elements distributed as , and denotes the channel covariance matrix for user . describes the spatial correlation properties of the channel due to macroscopic effects of propagation, including pathloss and shadowing. When the users are quasistationary, the pathloss and shadowing can be readily obtained based on the distance between the BS and user and stored at the BS as a priori [40],[41].
(9) 
IiB Pilot Sequences
In this work, we focus on two types of pilot sequences, i.e., the codebookfree and codebookbased pilots. For ease of exposition, we denote as as the pilot sequence transmitted by user , where is the length of the pilot sequence during each coherence interval. It is noteworthy that is predetermined based on the coherence budget.
1) Codebookfree pilots: As in previous works [38],[42], we assume that each pilot sequence can be arbitrarily selected from the dimensional space under the power constraint:
(2) 
where denotes the transmit power budget for user .
2) Codebookbased pilots: We denote the available codebook as , where denotes the th () potential pilot sequence. It is assumed that the different pilot sequences meet the orthonormality conditions, i.e., , and . Then, the pilot sequence of user is constructed as
(3) 
where denotes the transmit power of user and denotes the codebook sequence allocated to user .
IiC Uplink Training
In the uplink training phase, the BS estimates the uplink channels based on the pilot sequences simultaneously transmitted from the users. Focusing on the th user, the received pilot signal at the BS can be expressed as
(4) 
where the first term represents the desired contribution from user , the second term represents multiuser interference, and denotes the additive complex Gaussian noise matrix with i.i.d. entries following the distribution .
An analog combining matrix is employed to process the received signal at the BS with the goal of suppressing interference from the other users. In the HAD architecture, the analog combiner is typically implemented using phase shifters [9], which imposes constantmodulus constraints on the elements of the matrix . The output of the analog combiner is given by
(5) 
We employ RADCs to quantize as shown in Fig. 1, which enables flexible quantization bit allocation for each baseband channel according to the radio propagation characteristics. Such a refined design can efficiently mitigate the quantization errors and greatly improve the system performance with reduced hardware cost and power consumption. Let integer denote the number of available quantization bits of RADC . Assuming that the gain of automatic gain control is appropriately set, the additive quantization noise model (AQNM) can be employed to reformulate the quantized signal [29]. Then, based on the AQNM, the quantized output is specialized to
(6) 
where is the elementwise quantization function, is a diagonal gain matrix. Here, the quantization gain is a function of the number of quantization bit and defined as , where is a normalized quantization error. For , can be expressed exactly in terms of [29], while for , they can be approximated by . is the additive quantization noise which is independent of . To facilitate analytical derivations, we vectorize and obtain . obeys the complex Gaussian distribution with zero mean and covariance matrix
(7)  
where and .
Finally, we leverage the digital processing techniques for quantization loss mitigation and interference cancellation. Specifically, the retrieved signal of user at the output of the digital combiner is expressed as
(8) 
Iii Problem Statement
In this section, we first introduce the minimum MSE (MMSE)based estimator and then formulate the problem under investigation.
Iiia MMSE Channel Estimation
The BS aims to estimate the channel based on the received pilot signal . To facilitate analytical derivations and achieve the estimation of the desired channel using the MMSE estimator [42], we vectorize in (8) and obtain shown at the top of this page.
Defining as the MMSE estimate of the channel , we have
(10) 
where we define , , and We note that matrix is positive definite and therefore invertible. The corresponding MSE of user is given by
(11) 
The detailed derivations of and are shown in Appendix A. Then, the total MSE for the estimation of all the user channels can be expressed as
(12) 
IiiB Problem Formulation
We note that the effectiveness of hybrid beamforming mainly depends on the accuracy of CSI. In this work, we concentrate on the joint design of the pilot sequence, HAD combiner, and the allocation of ADC quantization bits, aiming to minimize the total MSE (12) for the given system model. To simplify notations, we introduce , , and . Since the matrices in (12) do not depend on the optimization variables , and , minimizing the is equivalent to maximizing . Hence, we consider the following optimization problem
(13a)  
s.t.  (13b)  
(13c)  
(13d)  
(13e) 
where and respectively denote the minimum and maximum number of quantization bits (), denotes the average number of quantization bits and is provided as the total budget of quantization bits at the BS. Constraint (13c) is imposed to enforce constantmodulus on the elements of analog combining matrix . Constraint (13d) limits the range of the quantization bits for each RADC, while constraint (13e) gives a threshold on the total ADC quantization bits at the BS.
It should be emphasized that problem (13) is extremely difficult to solve due to the constantmodulus constraints, the nonconvex mixedinteger feasible set, and the highly nonconvex objective function with matrix ratio term . To be specific, in the matrix fractional structure of the objective function, continuous variables and discrete variable appear in both the denominator and the numerator, which makes the problem intractable.
Iv Proposed BCDBased Algorithm for the CodebookFree Pilot Scheme
In this section, we focus on the codebookfree channel estimation where the pilot sequences meet condition (2). We first convert problem (13) into an equivalent and mathematically tractable one based on the FP method. Then, an efficient BCDbased joint design algorithm is proposed to solve the resulting problem, where a series of subproblems can be tackled via alternating optimization.
Iva Problem Transformation
With the aid of the advanced matrix FP techniques [43], we employ the ratiodecoupling approach to transform problem (13) into a more tractable yet equivalent form. To this end, we first introduce the auxiliary variable for each ratio term . Then problem (13) can be converted into the equivalent problem
(14a)  
s.t.  (14b)  
(14c) 
It is observed that the constraint regarding to in problem (14) are separable with respect to the other variables, i.e., .^{1}^{1}1Actually, matrix is not involved in the constraints. When these variables are fixed, each auxiliary variable can be optimally determined as follows:
(15) 
The detailed proofs of the equivalence between problem (13) and problem (14), as well as of the optimal solution (15) for are deferred to Appendix B. Using the matrix quadratic transformation, where the cost function in (13a) is replaced by that in (14a), we effectively decouple the numerator and denominator in each term and avoid the difficulties posed by the nonconvex fractional objective function.
IvB Proposed BCDBased Algorithm
Note that the constraints (13b)(13e) in problem (14) are uncoupled with respect to the variables , i.e., each one of the constraints involve only one of these variables at a time. Hence, to reach a solution, we can decompose (14) into several independent subproblems each involving a single variable and solve problem (14) by means of the BCD algorithm. The corresponding developments are elaborated in further details below.
1) Optimization of : In order to cope with the difficulties posed by the discrete integer variables , we first relax into a continuous value , solve the resulting problem for , and finally round each optimal continuous value to the nearest integer [32]. To determine the best integer quantization bits and efficiently control quantization error, we employ the following criterion for :
(16) 
where is properly chosen so that is satisfied. Notice that simple rounding to with can always satisfy constraint (13e) and reduce the power consumption while increasing the MSE and quantization errors. On the other hand, rounding to with may violate the constraint (13e), i.e., . Hence, considering (16), we apply the bisection method to find the optimal value of [44] and consequently, determine the optimal allocation of the quantization bits, which greatly achieves the quantization error control.
We denote as the vector of continuous variables after relaxation, to be used in (14) in place of . With the remaining variables being fixed, the subproblem for is formulated as
(17a)  
s.t.  (17b)  
(17c) 
where the objective function is expressed as
(18) 
It is rather challenging to globally solve a nonconcave optimization problem such as (17), due to the nature of the objective function. To address this difficulty, we therefore resort to successively solving a sequence of strongly concave approximate problems. Specifically, by virtue of the successive concave approximation (SCA) method [33], we construct a concave surrogate function in lieu of the objective function (IVB) at each iteration. The surrogate function used at iteration takes the form
(19) 
where is the gradient of with respect to at the current point
(which is calculated based on the chain rule),
is a positive constant, and the term is used to ensure the strong concavity of . Therefore, at the th iteration of the SCA algorithm, we need to solve the following linearly constrained quadratic surrogate problem to update :(20a)  
(20b) 
whose solution can be efficiently obtained using the offtheshelf CVX solver [45]. Accordingly, we can find the optimal in an iterative fashion, and finally apply the procedure in (16) to obtain the desired integer solution .
2) Optimization of : By keeping the other variables fixed, the subproblem for becomes a quadratic optimization problem with constantmodulus constraints, which is given by
(21a)  
s.t.  (21b) 
To handle the nontrivial constantmodulus constraints resulting from the analog combiner, we use the oneiteration BCDtype algorithm [46] to recursively solve problem (21). The detailed derivation is shown in Appendix C.
3) Optimization of : Similarly, the corresponding unconstrained subproblem for is given by
(22) 
where each term in the sum only involve the corresponding vector . Therefore, problem (22) can be further decomposed into a sequence of simple peruser cases, each one being a quadratic optimization problem. These peruser subproblems can be efficiently solved by the first order condition . Specifically, the optimal value of can be derived as
(23) 
4) Optimization of : We now turn to the optimization of the pilot matrix while fixing the other variables . In this case, the key step is to rewrite the objective function (14a) in a quadratic form with respect to , which results into
(24) 
where
(25) 
with being the th row vector of the matrix , and being the th vector on the column of , and is a constant term independent of variable . Therefore, the subproblem for the codebookfree pilot matrix is given by
s.t.  (26) 
which can be solved by the Lagrange multiplier method. By associating the Lagrange multiplier to the corresponding power budget constraint , the Lagrange function for (26) is given by
(27) 
By examing the first order optimality condition for , the optimal value of can be derived as
(28) 
where should be optimally determined as
(29) 
Note that (28) and (29) can be readily solved via the bisection search; ultimately, we substitute the optimal into (28) and obtain the optimal .
The corresponding BCDbased algorithm is summarized in Algorithm 1.
IvC Convergence Analysis and Computational Complexity
This subsection establishes the local convergence of Algorithm 1 to stationary solutions and presents its detailed computational complexity analysis. First, we introduce a key lemma, which can be readily proved according to [43].
Lemma 1
By invoking Lemma 1, the convergence of Algorithm 1 can be demonstrated. This property is summarized as Theorem 1 below, and its proof can be found in Appendix D.
Theorem 1
In the following, we analyze the computational complexity of Algorithm 1. We use the number of multiplications as a measure of complexity and assume that . Updating the auxiliary variables in (15) involves the calculation of , and the inverse of based on GaussJordan elimination with an overall complexity . According to the proposed oneiteration BCD type algorithm [46], updating all the entries of once has a complexity of . Furthermore, the overall computational complexity of is on the order of . The computational complexity of optimizing is dominated by computing the Jacobian matrix of with respect to . Thus, the complexity for updating is , where denotes the number of iterations of the SCA method. As for the pilot matrix , using the bisection method to search each Lagrangian parameter requires iterations to achieve a desired accuracy, where is the initial interval size and is the tolerance. Hence, the overall computational complexity of updating over all the users is . By retaining dominant terms, the overall complexity of the proposed Algorithm 1 is , where denotes the number of iterations.
V Proposed PDDBased Algorithm for the CodebookBased Pilot Scheme
In this section, we focus on the codebookbased channel estimation where pilot sequences are chosen from the codebook . Under this setup, we first recast the corresponding problem (14) from Section IV into a resource allocation problem with discrete binary codeword indicator variables. Subsequently, we introduce a set of auxiliary variables and propose an innovative PDDbased algorithm to solve the optimization problem.
(36) 
(37) 
Va Proposed PDDBased Algorithm
Since is herein structured as where is chosen from the codebook , the codebookbased pilot design can be regarded as a pilot resource allocation problem. Consequently, how to allocate the limited orthogonal pilot in to the different users to reduce the channel estimation error is of great importance. We introduce as the allocation indicator, where signifies that the best orthogonal pilot is assigned to user ; otherwise, we have . Hence, using we can write and .
With each expressed in terms of , problem (14) can be equivalently converted to the following problem:
(31a)  
s.t.  (31b)  
(31c)  
(31d) 
where with represents the search variables, and constraint (31c) guarantees that each user is associated with a single pilot sequence.
To address the difficulty posed by discrete binary constraints (31c) and (31d), we introduce the auxiliary variables , in terms of which these constraints can be equivalently expressed as
(32) 
(33) 
where and is the th column of identity matric . Then problem (31) is then equivalent to
(34a)  
s.t.  (34b) 
where importantly, the variables are no longer limited to binary values. To solve problem (34), we next introduce the proposed PDDbased algorithm which exhibits a doubleloop structure to solve problem (34). Based on the PDD framework [47, 48], we first add a penalized version of the equality constraints in (33) to the objective function (34a), thereby obtaining the following augmented Lagrangian (AL) problem
(35a)  
s.t.  (35b) 
where , , denote the Lagrange multipliers and denotes the penalty coefficient. We note that problems (34) and (35) are equivalent in the limit , which is at the hearth of the PDD method. Specifically, in the PDDbased algorithm, the inner loop solves the AL problem with fixed AL multipliers and penalty coefficient, while the outer loop aims to update the dual variables while reducing the penalty coefficient in light of the constraint violation.
Since the constraints in problem (35) are separable, we can address the AL problem (35) in the inner loop with the BCD method. Particularly, the subproblems for , and are the same as problems (17), (21) and (22) discussed in Section IVB, respectively, and can therefore be solved using the same methods. The optimization of the remaining variables in the inner loop, i.e. , and , is explained in further detail below.
1) Optimization of : We optimize in parallel for , with the remaining variables being fixed. The subproblem of optimizing can be simplified as the unconstrained problem in (V) shown at the top of this page, where . By examining the firstorder optimality condition of (V), we derive the closedform solution of shown in (V) at the top of this page. Finally, we use the oneiteration BCD method to update based on (V).
2) Optimization of : We optimize in parallel with the other variables fixed. The corresponding subproblem for can be expressed as
Comments
There are no comments yet.