I Introduction
Massive multiple-input multiple-output (MIMO) systems operating in the millimeter wave (mmWave) band are a key candidate technique for future-generation cellular systems to address the wireless spectrum crunch. They make use of the frequency band from 30 GHz to 300 GHz, which provides a much wider bandwidth than current cellular systems operating in microwave bands. In addition, the short wavelength of radio signals in the mmWave band enables very large antenna arrays to be equipped at the transceivers, which can significantly increase the spectral efficiency.
For mmWave MIMO systems, hybrid analog and digital precoding architectures have been proposed to achieve high spectral efficiency with low cost and power consumption. Extensive work has been devoted to designing hybrid precoding algorithms under perfect channel state information (CSI) and different constraints [1, 2, 3, 4, 5, 6, 7, 8]. However, it is difficult to obtain perfect CSI in mmWave MIMO systems. The reason is that the channel matrix cannot be measured directly at the baseband because it is intertwined with the choice of analog precoders. Furthermore, conventional MIMO channel estimation cannot exploit array gain in mmWave systems, which leads to a low signal-to-noise ratio (SNR). Therefore, conventional channel estimation requires long training sequences to estimate mmWave MIMO channels, which is impractical because mmWave MIMO channels vary quickly.
To address the challenge of training overhead, [9] proposed a hybrid precoding algorithm for single-user MIMO systems with partial knowledge of the CSI. For the multiuser MIMO scenario, [10] devised a mixed-CSI-based hybrid precoding structure, where the analog precoding design is based on the slowly varying channel statistics, and the digital precoding design is based on the instantaneous CSI. As a result, the dimension of the effective channel matrix (instantaneous CSI) is greatly reduced. However, the work in [9] and [10] considered only the fully-connected hybrid architecture, which requires many more phase shifters than the partially-connected structure [11]. In the partially-connected structure, the antenna array is partitioned into a number of smaller disjoint subarrays, each of which is driven by a single radio frequency (RF) chain [12]. This structure is an extension of classic antenna selection methods, which allocate each RF chain to an antenna element [13]. In [14], the authors developed a successive-interference-cancellation-based hybrid precoding scheme for the partially-connected structure with a fixed subset of antennas. The partially-connected structure with a dynamic subset of antennas is considered in [15], where a low-complexity greedy algorithm is also proposed to design the best partitioning/grouping of antennas over the RF chains.
Furthermore, most existing works on hybrid precoding assume Gaussian inputs, which are rarely realized in practice. It is well known that practical systems utilize finite-alphabet inputs, such as phase-shift keying (PSK) or quadrature amplitude modulation (QAM). Precoding designs under Gaussian inputs have been shown to be quite suboptimal for practical systems with finite-alphabet inputs [16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]. Recently, the authors in [28] presented a Broyden-Fletcher-Goldfarb-Shanno-based hybrid precoding algorithm for mmWave MIMO systems with finite-alphabet inputs. The proposed algorithm utilizes both gradient and Hessian information, and simulation results showed that it outperforms existing hybrid precoding algorithms including [3, 5, 6, 8].
I-A Contributions
In this paper, we investigate the hybrid precoding design for mmWave MIMO systems with finite-alphabet inputs under the following assumptions: 1) the system employs a partially-connected hybrid precoding structure with a dynamic subset of antennas; 2) the partition of antennas and the analog precoder are designed based on statistical CSI, and the digital precoder is designed based on either statistical CSI or instantaneous CSI. We consider both the statistical-CSI-based scenario and the mixed-CSI-based scenario, and the corresponding hybrid precoding problems under the two scenarios have the same mathematical form. We then propose a manifold-based gradient ascent algorithm to solve the hybrid precoding problem. The contributions of this paper are summarized as follows:

We present a simple criterion to design the best partition of antennas using statistical CSI. The corresponding dynamic subarray design is a (nonconvex) combinatorial optimization problem, and we propose a low-complexity algorithm to solve this problem.

We derive a lower bound of the average mutual information for mmWave MIMO channels. The lower bound plus a constant shift serves as a very accurate approximation to the average mutual information, and its complexity is much lower than the original average mutual information. To further reduce the complexity, we also derive an accurate approximation of the proposed lower bound.

We propose a manifold-based gradient ascent algorithm to design hybrid precoders. Simulation results show that 1) the proposed algorithm converges to a near globally optimal solution from arbitrary initial points; 2) the performance of mixed-CSI-based hybrid precoding is very close to that of instantaneous-CSI-based hybrid precoding; 3) the statistical-CSI-based hybrid precoding can achieve higher energy efficiency than the fully-connected hybrid precoding.
I-B Notations
The following notations are adopted throughout the paper: Boldface lowercase letters, boldface uppercase letters, and calligraphic letters are used to denote vectors, matrices and sets, respectively. The real and complex number fields are denoted by $\mathbb{R}$ and $\mathbb{C}$, respectively. The superscripts $(\cdot)^{T}$, $(\cdot)^{*}$ and $(\cdot)^{H}$ stand for transpose, conjugate, and conjugate transpose operations, respectively. $\mathrm{tr}(\cdot)$ is the trace of a matrix; $\|\cdot\|$ denotes the Euclidean norm of a vector; $\|\cdot\|_{F}$ represents the Frobenius norm of a matrix; $E_{\mathbf{x}}(\cdot)$ represents the statistical expectation with respect to $\mathbf{x}$; $[\mathbf{X}]_{ij}$ represents the $(i,j)$th element of $\mathbf{X}$; $\mathbf{I}$ and $\mathbf{0}$ denote an identity matrix and a zero matrix, respectively, with appropriate dimensions; $\circ$ represents the Hadamard matrix product; $\mathcal{I}(\cdot)$ represents the mutual information; $\Re$ and $\Im$ are the real and imaginary parts of a complex value; $\log(\cdot)$ is used for the base two logarithm.
II System and Channel Models
In this section, we present system and channel models for mmWave MIMO systems.
II-A System Model
Consider a point-to-point mmWave MIMO system, where a transmitter with antennas sends data streams to a receiver with antennas. The number of RF chains at the transmitter is , which satisfies . We consider the hybrid precoding scheme, where data streams are first precoded using a digital precoder, and then shaped by an analog precoder. The received baseband signal can be written as
(1) 
where is the mmWave channel matrix; is the analog precoder; is the digital precoder; is the input data vector and is the independent and identically distributed (i.i.d.) complex Gaussian noise with zero mean and covariance . To simplify our system model, we omit the analog and digital combiners, which can be designed in the same way as the hybrid precoder.
In this paper, the analog precoder is implemented by a dynamic phase shifter subarray, where each RF chain is connected to a dynamic subset of transmit antennas. Let denote the collection of transmit antennas connected to the th RF chain. We partition transmit antennas into subsets satisfying
(2) 
Since each RF chain can be connected to a different number of antennas, the cardinalities of may differ. In addition, if the th transmit antenna is connected to the th RF chain, i.e., , the th entry of has unit modulus; otherwise it is zero. Therefore, the constraints on can be expressed by
(3) 
where is the indicator function:
(4) 
The transmitted signal is restricted by a total power constraint :
(5) 
To decouple and in the coupled power constraint (5), we consider the following change of variables:
(6)  
(7) 
Then the power constraint in (5) becomes
(8) 
and the constraints on can be expressed by
(9) 
Furthermore, by plugging and into the system model in (1), we have
(10) 
Combining (8) and (10), we observe that and can be regarded as the effective channel and precoder for typical MIMO Gaussian channels, respectively. Since there exists a one-to-one mapping between and , we will focus on designing the effective analog and digital precoders throughout the rest of this paper.
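The effect of such a change of variables can be checked numerically. The sketch below is a minimal NumPy illustration under stated assumptions: the names `F_rf`, `B`, `F_eff`, and `B_eff` are hypothetical, the sizes are arbitrary, and the mapping `F_eff = F_rf (F_rf^H F_rf)^{-1/2}`, `B_eff = (F_rf^H F_rf)^{1/2} B` is one standard way to decouple a constraint of this type, assumed here rather than copied from (6) and (7).

```python
import numpy as np

rng = np.random.default_rng(0)
n_t, n_rf, n_s = 8, 3, 2  # illustrative sizes, not taken from the paper

# Random partition: antenna i is driven by RF chain owner[i]; every chain gets >= 1 antenna.
owner = np.concatenate([np.arange(n_rf), rng.integers(0, n_rf, n_t - n_rf)])

# Analog precoder: unit-modulus entry where antenna i belongs to chain j, zero elsewhere.
F_rf = np.zeros((n_t, n_rf), dtype=complex)
F_rf[np.arange(n_t), owner] = np.exp(1j * rng.uniform(0, 2 * np.pi, n_t))

# Arbitrary digital precoder.
B = rng.standard_normal((n_rf, n_s)) + 1j * rng.standard_normal((n_rf, n_s))

# Columns of F_rf have disjoint supports, so F_rf^H F_rf is diagonal with the subset sizes.
sizes = np.bincount(owner, minlength=n_rf).astype(float)
F_eff = F_rf / np.sqrt(sizes)          # assumed mapping: F_rf (F_rf^H F_rf)^{-1/2}
B_eff = np.sqrt(sizes)[:, None] * B    # assumed mapping: (F_rf^H F_rf)^{1/2} B

power_coupled = np.linalg.norm(F_rf @ B, "fro") ** 2
power_decoupled = np.linalg.norm(B_eff, "fro") ** 2
print(power_coupled, power_decoupled)
```

Under this assumed mapping the effective analog precoder has orthonormal columns, so the transmit power depends on the effective digital precoder alone, while the product of the two precoders is unchanged.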
II-B Channel Model
The mmWave MIMO channel is characterized by a standard multipath model [29, ch. 7.3.2]:
(11) 
where denotes the number of physical propagation paths between the transmitter and the receiver; represents the complex gain of the th propagation path. We assume that are i.i.d. complex Gaussian distributed with zero mean and unit variance; and represent the receive and transmit array steering vectors, with and being the angles of arrival (AOA) and the angles of departure (AOD), respectively. In this paper, the transmitter and receiver adopt uniform linear arrays, whose array steering vector is given by
(12) 
where is the number of antenna elements, is the wavelength of the carrier frequency, and is the antenna spacing.
The channel model in (11) can be rewritten more compactly as
(13) 
where ; and are stacked array steering vectors of AOA and AOD respectively, given by
(14)  
(15) 
This work assumes that the small-scale fading varies rapidly while the angle information and varies slowly [30]. Since the angle information changes slowly, we further assume that the transmitter can obtain statistical CSI through feedback, i.e., the transmitter knows and .
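To make the channel model concrete, the following is a minimal NumPy sketch that draws one channel realization for fixed angles. Several details are assumptions and not taken from the paper: the function names `ula_steering` and `mmwave_channel`, the unit-norm steering vector with half-wavelength spacing, the sign convention of the phase, and the `sqrt(N_r N_t / L)` normalization.

```python
import numpy as np

def ula_steering(theta, n, d_over_lambda=0.5):
    """Unit-norm ULA steering vector for angle theta in radians (assumed convention)."""
    k = np.arange(n)
    return np.exp(1j * 2 * np.pi * d_over_lambda * k * np.sin(theta)) / np.sqrt(n)

def mmwave_channel(aoa, aod, n_r, n_t, rng):
    """Draw H = scale * A_r diag(alpha) A_t^H for fixed AOA/AOD lists."""
    n_paths = len(aoa)
    A_r = np.column_stack([ula_steering(t, n_r) for t in aoa])   # n_r x L
    A_t = np.column_stack([ula_steering(t, n_t) for t in aod])   # n_t x L
    # i.i.d. CN(0, 1) path gains: the fast-varying part of the channel.
    alpha = (rng.standard_normal(n_paths) + 1j * rng.standard_normal(n_paths)) / np.sqrt(2)
    scale = np.sqrt(n_r * n_t / n_paths)                         # assumed normalization
    H = scale * A_r @ np.diag(alpha) @ A_t.conj().T
    return H, A_r, A_t

rng = np.random.default_rng(1)
aoa = rng.uniform(-np.pi / 2, np.pi / 2, 4)   # slowly varying angle information
aod = rng.uniform(-np.pi / 2, np.pi / 2, 4)
H, A_r, A_t = mmwave_channel(aoa, aod, n_r=16, n_t=64, rng=rng)
print(H.shape)
```

In this sketch the steering matrices play the role of the statistical CSI: they stay fixed across calls, and only the path gains are redrawn for each realization.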
III Problem Formulation
For mmWave MIMO systems, it may not be practical to obtain the instantaneous CSI by conventional channel estimation techniques because 1) the channel matrix measured in the baseband depends on the choice of analog precoder; 2) the training blocks may be prohibitively long due to the large bandwidth and low signal-to-noise ratio (SNR). To mitigate this difficulty, we propose new formulations in which analog and/or digital precoders are designed under statistical CSI.
III-A Statistical-CSI-Based Formulation
We assume that the transmitter has knowledge of the statistical CSI, including , and the distribution of . Then we design the analog and digital precoders to maximize the average mutual information. Suppose each entry of the input data vector is uniformly distributed over a given constellation set with cardinality . The average mutual information between and is given by
(16) 
where is the instantaneous mutual information between and [20]
(17) 
Here is a constant number, and , with and being two possible data vectors taken from . The average mutual information maximization problem can then be formulated as
(18) 
where is the maximum average mutual information with given partition of subsets, i.e.,
(19) 
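The instantaneous mutual information under finite-alphabet inputs has no closed form, so it is usually estimated by averaging over noise samples. The sketch below is a minimal Monte Carlo estimator, assuming the widely used expression `log2(K) - (1/K) sum_m E_n log2 sum_k exp(-(||G(x_m - x_k) + n||^2 - ||n||^2) / sigma2)` with `K` the number of candidate input vectors. The function name `finite_alphabet_mi`, the matrix `G` (standing for the product of channel and precoders), and the sample count are illustrative assumptions, and the constant may be arranged differently in (17).

```python
import itertools
import numpy as np

def finite_alphabet_mi(G, constellation, sigma2, n_noise=2000, seed=0):
    """Monte Carlo estimate (bits) of I(x; y) for y = G x + n, n ~ CN(0, sigma2 I)."""
    rng = np.random.default_rng(seed)
    n_r, n_s = G.shape
    # All M^{N_s} candidate input vectors, one per column.
    X = np.array(list(itertools.product(constellation, repeat=n_s)), dtype=complex).T
    n_vec = X.shape[1]
    GX = G @ X
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((n_r, n_noise))
                                   + 1j * rng.standard_normal((n_r, n_noise)))
    noise_energy = np.sum(np.abs(noise) ** 2, axis=0)
    acc = 0.0
    for m in range(n_vec):
        diff = GX[:, [m]] - GX                              # G (x_m - x_k) for all k
        shifted = diff[:, :, None] + noise[:, None, :]      # n_r x n_vec x n_noise
        expo = -(np.sum(np.abs(shifted) ** 2, axis=0) - noise_energy) / sigma2
        peak = expo.max(axis=0)                             # log-sum-exp for stability
        acc += np.mean(peak + np.log(np.sum(np.exp(expo - peak), axis=0)))
    return np.log2(n_vec) - acc / (n_vec * np.log(2))

bpsk = [1.0, -1.0]
mi_high_snr = finite_alphabet_mi(np.eye(2), bpsk, sigma2=1e-3)
mi_mid_snr = finite_alphabet_mi(np.eye(2), bpsk, sigma2=1.0)
mi_low_snr = finite_alphabet_mi(np.eye(2), bpsk, sigma2=1e6)
print(mi_high_snr, mi_mid_snr, mi_low_snr)
```

The cost grows with the square of the number of candidate vectors, which is why the average mutual information, with an additional expectation over the channel, is expensive to evaluate directly.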
Problem (18) is a combinatorial optimization problem for which finding the optimal solution requires an exhaustive search over all nonempty in . The total number of combinations is known as the Stirling number of the second kind [31] and is given by
(20) 
Then we can rewrite problem (18) as
(21) 
where represents the th given partition of subsets belonging to .
Although (21) provides a theoretically possible way of solving problem (18), its computational complexity is prohibitive even for a small number of transmit antennas and RF chains. For example, when and , is equal to , which implies that we need to solve problem (19) over ten million times to obtain the optimal analog and digital precoders.
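The growth of the search space can be reproduced with a few lines of Python. The sketch below uses the standard recurrence for Stirling numbers of the second kind; the function name `stirling2` and the array sizes printed are illustrative choices, not values taken from the paper.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to partition n labeled items into k nonempty subsets."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Item n either joins one of the k existing subsets or starts a new one.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

# Illustrative sizes: number of antenna partitions for 4 RF chains.
for n_antennas in (8, 12, 16):
    print(n_antennas, stirling2(n_antennas, 4))
```

Even for these modest sizes the count grows roughly geometrically in the number of antennas, which motivates replacing the exhaustive search with a cheaper criterion.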
We propose a new formulation to reduce the computational complexity of problem (18). Recall that represent positions of nonzero entries in , and the role of is to reshape the effective channel matrix . Therefore, we design and the corresponding such that the average effective channel gain is maximized. The dynamic subarray design problem can then be formulated as
(22) 
We solve problem (22) to obtain its optimal solutions, denoted by and . Then we solve problem (19) with given to obtain the optimally effective analog and digital precoders . Note that since is not obtained by maximizing the average mutual information, we do not use it directly as the optimally effective analog precoder. However, the solution serves as a good initial point for solving problem (19). Therefore, we first design a low complexity algorithm to solve problem (22), and then design an effective algorithm to solve the hybrid precoding problem (19) with given .
III-B Mixed-CSI-Based Formulation
The basic idea of the mixed-CSI-based formulation is to design the analog precoder based on statistical CSI, and then estimate the reduced-dimensional effective channel matrix , where is the optimally effective analog precoder based on statistical CSI. After that, the transmitter utilizes the instantaneous effective channel matrix to design the effective digital precoder , which is a typical MIMO precoding problem. In this case, the burden of channel estimation is greatly reduced because the dimension of is much smaller than that of .
Given the instantaneous effective channel matrix , the digital precoding problem can be expressed by
(23) 
where is the maximum mutual information under the given effective channel matrix . Then the mixed-CSI-based hybrid precoding problem can be formulated as
(24) 
Problem (24) is intractable because the objective function is prohibitively expensive to compute. To estimate at a given point , we need to solve the nonconvex problem (23) thousands of times for randomly generated channel matrices . To mitigate this difficulty, we replace by a computationally efficient bound. Invoking Jensen's inequality, can be lower bounded by
(25) 
Replacing by its lower bound, problem (24) is approximated as
(26) 
which is exactly the same as problem (18). We can therefore use the same procedure to solve this problem, i.e., we first solve problem (22) to obtain , and then solve problem (19) with given to obtain the optimally effective analog precoder. Note that although the statistical-CSI-based formulation and the mixed-CSI-based formulation solve the same optimization problem, there is an important difference between them. The optimization variable in the mixed-CSI-based formulation is just an auxiliary variable introduced for the analog precoder design. After obtaining the optimally effective analog precoder, the actual digital precoder should be obtained by solving problem (23).
IV Dynamic Subarray Design
In this section, we propose a low-complexity algorithm to solve problem (22). Note that the objective function in problem (22) can be rewritten as
(27) 
where the second equality in equation (27) holds because . Plugging into equation (28), we obtain the following problem
(28)  
It is difficult to solve problem (28) directly because the feasible set of problem (28) is characterized by and . To address this issue, the following proposition rewrites the feasible set as explicit constraints of .
Proposition 1
The feasible set of problem (28) can be expressed by
(29) 
where denotes the th row of , and represents the total number of nonzero elements in a vector.
Proof:
See Appendix.
According to Proposition 1, we rewrite problem (28) as
(30)  
Problem (30) is still intractable due to nonconvex discrete constraints and . Therefore, we first drop the constraints and consider the unconstrained problem
(31) 
Problem (31) is a generalized eigenvalue problem, and its optimal solution is given by [15]
(32) 
where consists of the left singular vectors of corresponding to the largest singular values, and is an arbitrary unitary matrix. Note that when , the remaining left singular vectors in can be chosen arbitrarily as long as satisfies .
In general, if there exists a unitary matrix such that the unconstrained optimal solution satisfies (29), then is the globally optimal solution of problem (30). However, such may not exist, and thus we use to find a nearby feasible solution. Specifically, consider the following optimization problem
(33)  
where denotes the set of unitary matrices. Since the optimization variables and are separate, we adopt the alternating minimization approach to solve problem (33).
Given , the optimal of problem (33) has a simple closed form solution. Let , then the optimal of problem (33) can be expressed by
(34) 
Given , problem (33) reduces to an orthogonal Procrustes problem
(35) 
Let the singular value decomposition of be
(36) 
where is a unitary matrix with left singular vectors, is a diagonal matrix with singular values arranged in decreasing order, and is another unitary matrix with right singular vectors. Then the optimal solution of problem (35) is given by [32]
(37) 
Combining (34) and (37), we propose a simple alternating minimization algorithm to solve problem (33) and obtain the corresponding near-optimal partition of subsets . The details of this algorithm are summarized in Algorithm 1.
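The unitary update in the alternating scheme is the classic orthogonal Procrustes solution, which can be sketched in NumPy as follows. The function name `procrustes_unitary` and the generic matrices `A` and `B` are hypothetical stand-ins: the sketch assumes the subproblem has the form of minimizing `||B - A U||_F` over unitary `U`, and it does not implement the closed-form update (34) of the other variable.

```python
import numpy as np

def procrustes_unitary(A, B):
    """Return the unitary U minimizing ||B - A U||_F (orthogonal Procrustes)."""
    # With A^H B = V S W^H, the minimizer is U = V W^H.
    V, _, Wh = np.linalg.svd(A.conj().T @ B)
    return V @ Wh

# Sanity check: recover a known unitary rotation from noiseless data.
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
U_true, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
B = A @ U_true
U_hat = procrustes_unitary(A, B)
print(np.linalg.norm(B - A @ U_hat))
```

Because each half-step is solved exactly, alternating between this update and the closed-form update of the other variable cannot increase the objective.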
We conclude this section with several remarks on Algorithm 1:

The convergence of Algorithm 1 is guaranteed because the objective function of problem (33) is bounded from below and does not increase in any iteration.

Since problem (33) is a nonconvex problem, the solution obtained by Algorithm 1 depends on the initial unitary matrix . Therefore, we can run Algorithm 1 several times with different initial , and then choose the solution corresponding to the largest .

When is determined, the corresponding is given by
V Hybrid Precoding With Finite-Alphabet Inputs
In this section, we first derive the lower bound for the average mutual information , and then propose an effective algorithm to design analog and digital precoders.
V-A Lower Bound For Average Mutual Information
It is difficult to compute and optimize the average constellation-constrained mutual information directly because neither nor its gradient has a closed-form expression. To estimate as well as its gradient, we need to use Monte Carlo methods and/or numerical integration, whose computational complexity is prohibitively high.
This difficulty can be partially mitigated by the following proposition, which provides the lower bound of in closed form.
Proposition 2
The average constellation-constrained mutual information of mmWave MIMO channels can be lower bounded by
(38) 
where
(39) 
Proof:
See Appendix.
The computational complexity of the lower bound is still very high because it requires evaluating the determinant times. For example, when we adopt 16-QAM modulation () and the number of data streams is 4, is equal to . To further reduce the complexity, we notice that the receive steering vectors are asymptotically orthogonal to each other when the number of receive antennas approaches infinity, i.e., . Based on this observation, we derive a low-complexity approximation of in the following proposition.
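The asymptotic orthogonality of steering vectors is easy to observe numerically. The following minimal sketch assumes a unit-norm uniform linear array with half-wavelength spacing; the function names and the two angles are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

def ula_steering(theta, n, d_over_lambda=0.5):
    """Unit-norm ULA steering vector for angle theta in radians (assumed convention)."""
    k = np.arange(n)
    return np.exp(1j * 2 * np.pi * d_over_lambda * k * np.sin(theta)) / np.sqrt(n)

def steering_correlation(theta1, theta2, n):
    """Magnitude of the inner product between two steering vectors."""
    return abs(np.vdot(ula_steering(theta1, n), ula_steering(theta2, n)))

theta1, theta2 = 0.3, 0.9            # two distinct angles in radians
sizes = [4, 16, 64, 256]
corr = [steering_correlation(theta1, theta2, n) for n in sizes]
for n, c in zip(sizes, corr):
    print(n, c)
```

For distinct angles the correlation is a normalized geometric sum, so it is bounded by a constant divided by the array size and vanishes as the array grows.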
Proposition 3
The lower bound can be approximated by
(40) 
where , with being the th column of . In addition, the limit of is as approaches infinity.
Proof:
See Appendix.
The accuracy and computational complexity of the lower bound and its approximation will be shown in Fig. 1 and Table 1 in the simulation result section.
V-B Hybrid Precoding Design
In this section, we solve the hybrid precoding problem (19) with given obtained by Algorithm 1. First, by replacing the average mutual information with the approximated lower bound , problem (19) can be approximated as
(41)  
Note that the constraint implies that only the phase of nonzero can be changed. Therefore, instead of using as the optimization variable, it is more convenient to optimize the phase of nonzero entries in . Define the phase matrix as
(42) 
where represents the phase of . Then can be expressed as
(43) 
Using as the optimization variable and defining a new function , problem (41) can be rewritten as
(44)  
Here we express the power constraint as because is monotonically increasing with respect to . Then we provide the gradient of in the following proposition, which forms the foundation for solving problem (44).
Proposition 4
The gradients of with respect to and are given by
(45) 
where
(46) 
with
Proof:
See Appendix.
We propose a manifold-based gradient ascent algorithm to optimize and simultaneously using the gradient information. At the th iteration, the algorithm updates the current solution to by the following rules
(47) 
where is the stepsize, , and is the gradient of on the following (sphere) manifold
(48) 
Based on the definition, can be computed by projecting onto the tangent space at , where is given by
(49) 
Then can be expressed by
(50) 
Using the standard Lagrangian multiplier method, the closed form solution of problem (50) is given by
(51) 
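The projection onto the tangent space of a sphere has a simple form that can be sketched as follows. This is a minimal NumPy illustration, assuming the manifold is the set of matrices with fixed Frobenius norm and that iterates are mapped back to it by rescaling; the function names, the stepsize, and the retraction choice are assumptions, not the exact update rule in (47).

```python
import numpy as np

def tangent_projection(B, G):
    """Project G onto the tangent space of {B : ||B||_F^2 = const} at B."""
    inner = np.real(np.vdot(B, G))               # Re tr(B^H G)
    return G - inner / np.real(np.vdot(B, B)) * B

def retract(B, power):
    """Map a matrix back to the sphere ||B||_F^2 = power by rescaling."""
    return np.sqrt(power) * B / np.linalg.norm(B)

rng = np.random.default_rng(3)
power = 2.0
B = retract(rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2)), power)
G = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))  # stand-in Euclidean gradient

xi = tangent_projection(B, G)         # ascent direction on the manifold
B_next = retract(B + 0.1 * xi, power) # one illustrative step
print(np.real(np.vdot(B, xi)), np.linalg.norm(B_next) ** 2)
```

The projected direction removes the component of the gradient that would only change the norm, so the step spends its budget on directions that keep the power constraint active.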
After obtaining the ascent direction, we need to determine the stepsize such that the objective function increases in each iteration. We propose a modified backtracking line search method, which is usually more efficient than the classic backtracking line search [33]. The main idea is to use as the initial guess of , and then either increase or decrease it to find the largest such that
where is a constant to control the stepsize. Specifically, the stepsize is set as
(52) 
where is the smallest integer such that , and is the smallest integer such that . The details of our proposed manifold-based gradient ascent algorithm are summarized in Algorithm 2.
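The warm-started stepsize rule can be sketched as below. This is a simplified illustration under stated assumptions: it uses an Armijo-type sufficient-increase test with hypothetical parameters `beta` and `c`, takes the previous stepsize `mu_prev` as the initial guess, and is demonstrated on a toy concave function instead of the actual objective of problem (44).

```python
import numpy as np

def adaptive_step(f, x, direction, slope, mu_prev, beta=0.5, c=1e-4, max_iter=50):
    """Warm-started backtracking for ascent.

    Starts from the previous stepsize mu_prev, then grows it (divide by beta)
    while the sufficient-increase test keeps passing, or shrinks it (multiply
    by beta) until the test passes.
    """
    def accepts(mu):
        return f(x + mu * direction) >= f(x) + c * mu * slope

    mu = mu_prev
    if accepts(mu):
        for _ in range(max_iter):
            if not accepts(mu / beta):
                break
            mu /= beta
    else:
        for _ in range(max_iter):
            mu *= beta
            if accepts(mu):
                break
    return mu

f = lambda x: -float(np.dot(x, x))    # toy concave objective
x0 = np.array([1.0, 1.0])
g = -2.0 * x0                         # gradient of f at x0, used as ascent direction
slope = float(np.dot(g, g))           # directional derivative along g
mu_small = adaptive_step(f, x0, g, slope, mu_prev=0.1)
mu_large = adaptive_step(f, x0, g, slope, mu_prev=4.0)
print(mu_small, mu_large)
```

Starting from the previous stepsize usually saves function evaluations compared with restarting from a fixed large value, because consecutive iterations tend to accept stepsizes of similar magnitude.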