Massive multiple-input multiple-output (MaMIMO) is a physical layer technology in which the base stations (BSs) are equipped with a large number of antennas. This allows MaMIMO systems to spatially multiplex tens of users on the same time-frequency resources, which greatly enhances the spectral efficiency of the network. Furthermore, simple signal processing techniques, such as linear precoding and detection schemes, are close to optimal due to the quasi-orthogonality of the channels. These features make MaMIMO a key technology for future wireless networks.
A crucial design criterion for 5G systems is improved energy efficiency. For MaMIMO systems, it has been shown that the downlink transmission power required to maintain a constant signal-to-interference-plus-noise ratio (SINR) is inversely proportional to the number of BS antennas. However, the total power consumption does not necessarily decrease with an increased number of antennas when using a realistic power consumption model that also takes the hardware-consumed power into account [7, 8].
There are various studies on the energy efficiency of MaMIMO systems, which is commonly defined as the ratio between the data rate and the power consumption. Numerical analysis has shown that MaMIMO systems have the potential to improve the energy efficiency by up to a factor of compared to a setup with typical LTE BSs . For a given uplink sum rate, the optimal numbers of BS antennas and users are investigated for a single-cell system in . However, the cost of acquiring channel state information (CSI) is ignored, which can lead to misleading results. In , the joint uplink-downlink energy efficiency is maximized with respect to the number of BS antennas, data rate, number of users, and transmission powers. In a multi-cell setup, the energy efficiency optimization problem with respect to the number of BS antennas is investigated in , which assumes perfect CSI. The total power consumption for a single-cell system is minimized in  under a low traffic assumption. These prior works can provide general insights into the deployment of energy-efficient networks, but when the network is in place, one cannot optimize the data rates or the number of users, since these are given by the actual user traffic.
When the traffic load is low, the energy efficiency of a multiantenna system can be increased by utilizing only a subset of the available BS antennas, referred to as antenna selection. Such an approach allows the BS to utilize only the antennas that, for the current small-scale fading realization, provide a high contribution to the SNR. Antenna selection based on small-scale fading realizations was recently studied in the context of MaMIMO [14, 15]. It has been shown that antenna selection provides a significant increase in capacity when the number of antennas is greater than the number of RF chains and perfect CSI is available. A genetic algorithm for antenna selection that is capable of optimizing different objective functions is proposed in . In , a subset of the antennas is selected in a decentralized manner by the receiving users under a setup with imperfect CSI. An iterative water-filling scheme for antenna selection is presented in . A detailed comparison of antenna selection approaches is presented in . In principle, the unselected antennas can be turned off to save energy, but in wideband systems with many subcarriers it is unlikely that a certain antenna is simultaneously unselected on all subcarriers; thus, antennas generally cannot be turned off. Furthermore, antenna selection based on small-scale fading realizations requires all antennas to be turned on during channel estimation, so the sleep time is very short.
In this work, we consider the downlink of a MaMIMO system and minimize the total power consumption by jointly optimizing the transmission powers and the number of active antennas, with given constraints on the maximum transmit power, the required effective SINRs, and available number of antennas. The optimization is based on the large-scale fading coefficients, not the small-scale fading realizations, which enables us to turn off antennas to save power when the traffic load is low. For downlink precoding, maximum ratio transmission (MRT) and zero-forcing (ZF) are considered. A conference version of this work has been presented at , where only a single-cell setup is considered. This paper contains more general and complete results for the single-cell case along with the investigation of multi-cell setups. To summarize, the main contributions of this work are as follows:
The number of BS antennas and transmission powers are jointly optimized, while satisfying individual power and SINR constraints for both single-cell and multi-cell setups. The feasibility conditions are derived analytically. The problem can be infeasible for a given number of antennas, but always becomes feasible as the number of antennas increases for a single-cell system. (Lemmas 1-2)
We prove that increasing the number of BS antennas in any cell does not deteriorate the performance of the overall system. (Lemma 7)
For the multi-cell case, we reformulate the joint optimization problem as a geometric programming problem that can be solved efficiently.
I-A Organization and Notation
Vectors and matrices are denoted by boldface lowercase and uppercase letters, respectively. The superscripts and represent the transpose and conjugate transpose. is the identity matrix. The inverse of a matrix is denoted by and is its spectral radius. are the sets of real numbers and integers whereas the strictly positive integers are represented by . The trace operator is denoted by and is used for the Euclidean norm. The -th element of a matrix is denoted by and similarly the th element of a vector is described by . For matrices/vectors operators are used for component-wise inequalities.
The rest of the paper is organized as follows. The system setup is introduced in Section II. The single-cell setup is investigated in Section III, including the closed-form solution for the joint optimization problem. In Section IV, the results are extended to multi-cell systems and the joint optimization problem is reformulated as a geometric programming problem. The numerical analysis is presented in Section V and Section VI concludes the paper.
II System Setup
We consider a massive MIMO system with cells, indexed by , and each cell has a BS with active antennas, out of available antennas. Cell contains users that simultaneously communicate with its BS. The system operates in time-division duplex (TDD) mode, where time-frequency resources are divided into coherence intervals, such that each channel is constant and frequency-flat in each interval. The channels are assumed to take independent realizations in each coherence interval from stationary ergodic processes. A coherence interval consists of three phases: uplink data transmission, uplink training, and downlink data transmission. In this paper, we focus on the uplink training and downlink data transmission, while we leave uplink data transmission analysis as future work.
Let denote the length of the coherence interval (in samples) and assume that samples are used for uplink training. The remaining samples are used for downlink data transmission. During the training, the users simultaneously transmit their pilot sequences, which are known to the BS, and the channels are estimated based on the received signals.
III Single-Cell System
We first analyze a single-cell system and extend the results to multi-cell systems in Section IV. Since we consider a single cell, we drop the cell index in this section. In the training phase, the users concurrently transmit -length pilot sequences. The pilot sequences are assumed to be orthogonal and therefore the length must be at least the number of users, i.e., . The channel between user and the BS is represented by
where denotes the large-scale fading and is the vector representing the small-scale fading. The elements of are assumed to be i.i.d. . The large-scale fading is assumed to be known at the BS, whereas the small-scale fading is to be estimated in each coherence interval. A minimum mean-square error (MMSE) estimator is utilized to obtain an estimate of . Since the channels are statistically identical across all antennas, the mean square of the channel estimate is the same for all and for the th component it is given by
where denotes the uplink SNR.
In the downlink, the BS precodes and scales the data symbols to generate the transmit signal. Throughout this work, we only consider zero-forcing and maximum-ratio transmission. Although there are other methods capable of achieving somewhat better SINRs , closed-form expressions of the effective SINRs for these methods are only available in the asymptotic region. Furthermore, ZF and MRT are nearly optimal under high and low SINR conditions, respectively . Let be the normalized transmission power for user , then the effective SINR is given by
for MRT and
for ZF. Note that the “effective SINR” corresponds to the SNR of an additive white Gaussian noise (AWGN) channel with equivalent capacity, such that is an ergodic achievable rate. Further details on the derivation for (3) and (4) are given in .
In a communication system, each user has a quality-of-service (QoS) requirement determined by the user application, and it must be satisfied by allocating system resources. QoS requirements may depend on various parameters such as latency, rate, SINR, and jitter. In this work, we assume these requirements are in the form of effective SINRs, i.e., each user has a desired SINR such that the QoS requirement is
This is equivalent to having a rate constraint of . In practice, the value of will change over time and it is important to have an efficient scheme to reallocate resources when this happens.
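The correspondence between an SINR target and a rate target can be sketched as follows. This is a minimal illustration that assumes the AWGN-equivalent mapping between an effective SINR and a spectral efficiency described above; any pre-log factor accounting for the training overhead is deliberately omitted, and the function names are hypothetical.

```python
import math


def sinr_to_rate(sinr: float) -> float:
    """Spectral efficiency (bit/s/Hz) of an AWGN channel with the given SNR."""
    return math.log2(1.0 + sinr)


def rate_to_sinr(rate: float) -> float:
    """Inverse mapping: effective SINR needed for a given spectral efficiency."""
    return 2.0 ** rate - 1.0


# An SINR target of 3 (about 4.8 dB) corresponds to 2 bit/s/Hz.
target_sinr = 3.0
target_rate = sinr_to_rate(target_sinr)
print(target_sinr, target_rate)
```

Because the mapping is monotonically increasing, a per-user rate requirement can be converted to an SINR requirement once, before the resource allocation is carried out.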
A system is called feasible if it is possible to satisfy the SINR requirements of each user simultaneously for a given with a positive power vector and infeasible otherwise. The system resources to be allocated are the number of BS antennas, and the transmit powers, . This is a generalization of prior works, in which the number of antennas is usually constant.
where is the normalized noise vector, is a diagonal matrix given by
and is a rank one matrix with
Notice that must be a positive integer, hence for ZF precoding must be greater than the number of users. The maximum is
where is the maximum number of antennas available at the BS.
If (6) is satisfied with equality, we have
which is a feasible solution (i.e., the desired SINR values can be satisfied simultaneously with a positive ) if and only if the spectral radius of , denoted by , is less than . Furthermore, for the feasible case, the solution given by is the unique minimum power vector, i.e., any satisfying (5) requires at least as much power component-wise [23, 24].
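The feasibility test and the minimum power vector can be illustrated with a generic power-control sketch. It assumes a system of the form p = F p + u, where the entrywise positive matrix F and the positive vector u are hypothetical stand-ins for the scaled interference matrix and noise vector of this section (whose exact definitions are not reproduced here), and the feasibility condition is normalized to a spectral radius below one.

```python
def mat_vec(F, p):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(f * x for f, x in zip(row, p)) for row in F]


def spectral_radius(F, iters=1000):
    """Power-iteration estimate of the spectral radius.

    Assumes F is entrywise positive, so that the iteration converges
    to the Perron eigenvalue.
    """
    v = [1.0] * len(F)
    rho = 0.0
    for _ in range(iters):
        w = mat_vec(F, v)
        rho = max(w)
        if rho == 0.0:
            return 0.0
        v = [x / rho for x in w]
    return rho


def minimum_power(F, u, tol=1e-12, max_iters=100000):
    """Component-wise minimal p solving p = F p + u, or None if infeasible."""
    if spectral_radius(F) >= 1.0:
        return None  # no positive power vector meets all targets
    p = [0.0] * len(u)
    for _ in range(max_iters):
        p_next = [fp + ui for fp, ui in zip(mat_vec(F, p), u)]
        if max(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p


# Hypothetical two-user examples (values chosen only for illustration).
F_ok = [[0.2, 0.1], [0.3, 0.1]]
F_bad = [[0.9, 0.5], [0.5, 0.9]]
u = [1.0, 1.0]
p_ok = minimum_power(F_ok, u)
p_bad = minimum_power(F_bad, u)
print(p_ok, p_bad)
```

The fixed-point iteration starts from zero and increases monotonically, which is one way to see why the limit is the component-wise smallest power vector that satisfies the constraints.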
First, consider the power optimization problem
where is the maximum transmit power of the BS.
For a given , the minimum power solution to (P1) is given by (11) if the system is feasible. Moreover, since (11) provides the minimum power solution, it must be the solution to (P1) and satisfy the power constraints. Otherwise, (P1) has no solution with the given power constraints for the given . A key difference of multi-antenna systems is that, contrary to single-antenna systems, the number of antennas is also a variable to be optimized. We consider the problem of finding not only the optimal , but the optimal -pair for (P1).
An important property of (P1) is that even without the power constraints the feasibility of the problem is not guaranteed and depends on the number of antennas, the desired SINRs, and the ratios , which are indicators of the channel estimation quality. The desired SINRs and are fixed, whereas the number of BS antennas is an optimization parameter. Some results on the feasibility of the problem are provided below.
Rewriting (11), we have
This result suggests that it is impossible to turn an infeasible system into a feasible one by reducing the number of antennas. Next, a simple extension of Lemma 1 is stated without proof.
Let be a pair that satisfies the SINR constraints defined in (6), then there exists at least one vector for any such that also satisfies the SINR constraints.
The preceding analysis shows the effect of changing the number of antennas on the feasibility of the system. However, the solution to (P1) must minimize the transmission powers among all possible solutions in the feasible set.
Consider the problem defined by (P1) and assume that there exists at least one feasible solution. Let the pair denote the optimal solution to the problem, then and .
First, we need to show that for any pair with satisfying (6), we have , where is the minimum power solution corresponding to . Recall that for a given the minimum power solution is given by (11). Hence, it is sufficient to compare the resulting power vectors using (11) which satisfies the SINR constraints (6) with equality, i.e.,
Since , the equality implies . Since, with increasing , the transmission powers can be reduced and it is not possible to turn a feasible system into an infeasible one by increasing the number of antennas, as shown in Lemma 2, we have . Finally, the optimal power vector is given by (11) which concludes the proof.
This theorem proves that one should keep all antennas on in a MaMIMO system to minimize the transmission power, which is rather intuitive given the power-scaling laws in .
The solution provided by Theorem 1 is valid for both precoding schemes, MRT and ZF. However, the exact values of the optimal solution are different since and are defined differently. Furthermore, the feasibility of one precoding scheme does not imply the feasibility of the other.
Fig. 1 illustrates the change in the required transmission power as a function of the number of BS antennas, for different numbers of users uniformly distributed in a circular cell and ZF precoding. The simulation parameters for the numerical analysis provided throughout the paper are summarized in Table I. For this particular example, the SINR target is for each user. A low value is chosen to guarantee the feasibility of the system when the number of antennas is small. As expected, Fig. 1 shows that the total power decreases with an increasing number of antennas, which verifies the preceding analysis. The figure also shows that, as the number of users increases, the required total transmission power increases. Note that the gain from increasing is very significant when is small but diminishes as grows. Furthermore, the total transmission power grows faster than linearly with respect to and decreases linearly with , which is in alignment with the results provided in .
So far, the cost of increasing the number of antennas has been neglected. In a practical system, using more antennas has a cost in terms of increased circuit power . Hence, a more practical consumption model is
where is the circuit power consumption per antenna and represents the amplifier inefficiency factor which accounts for power dissipation in the amplifiers. The joint optimization problem based on the consumption model in (14) is
Consider the following equivalent optimization problem, which is obtained by dividing the cost function by and defining the relative power cost of operating each antenna, ,
Note that we have also replaced with in the cost function, which does not change the solution. (Footnote 1: For ZF precoding, the first term of the cost function in (P2) is equal to . Since is a constant term, the problem is equivalent to the one with and both problems have identical optimal solutions.) Obviously, the optimal solution depends on and (P2) reduces to (P1) when .
denote the cost function for a given pair. Before we state our main result for single-cell systems, we first investigate the required number of antennas such that the SINR constraints can be satisfied and state the following.
Recall that (11) provides a feasible solution if and only if the spectral radius of is less than . is a rank-one matrix and . Hence, .
Lemma 3 shows that the number of active antennas can be adjusted based on the traffic load of the system as depends on the large-scale coefficients, channel estimation quality and target SINR levels.
We need the following special case of the Sherman-Morrison-Woodbury lemma [25, p. 19]. Let be a rank-one matrix, then
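As a numerical sanity check of this type of identity, the sketch below verifies one common rank-one special case of the Sherman-Morrison formula: for a rank-one matrix B with tr(B) ≠ -1, the inverse of I + B equals I - B / (1 + tr(B)). This form is an assumption made for illustration; the exact statement used in the paper is not reproduced here, and the helper names are hypothetical.

```python
def outer(u, v):
    """Rank-one matrix u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def mat_mul(A, B):
    """Product of two square list-of-rows matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_identity_plus_rank_one(B):
    """Return (I + B)^{-1} = I - B / (1 + tr(B)) for a rank-one matrix B."""
    n = len(B)
    trace = sum(B[i][i] for i in range(n))
    if abs(1.0 + trace) < 1e-12:
        raise ValueError("I + B is singular")
    return [[(1.0 if i == j else 0.0) - B[i][j] / (1.0 + trace)
             for j in range(n)] for i in range(n)]


# Illustrative rank-one matrix built from two arbitrary vectors.
B = outer([1.0, 2.0, 3.0], [0.5, -1.0, 0.25])
I_plus_B = [[(1.0 if i == j else 0.0) + B[i][j] for j in range(3)]
            for i in range(3)]
product = mat_mul(I_plus_B, inverse_identity_plus_rank_one(B))
print(product)  # numerically close to the 3x3 identity matrix
```

The identity follows from B² = tr(B)·B, which holds for every rank-one matrix.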
The transmission powers are shown to decrease as the number of antennas increases in Theorem 1. Next, we derive the minimum number of antennas required to satisfy both the power and SINR constraints of the joint optimization problem defined in (P2).
Consider the problem defined in (P2) and let be the minimum number of antennas required to satisfy both the SINR and power constraints. Then, where
Lemma 4 reveals the minimum required number of antennas such that (P2) has a solution. This provides a lower bound on the optimum number of antennas. The main result for the single-cell systems regarding (P2) is stated below.
Assume that there exists at least one -pair that satisfies the constraints in (P2) with and let denote the optimal solution, then
First, note that for a given , the transmit powers that minimize are given by (11). Hence, can be considered a function of only , given by
It is straightforward to show that is a strictly convex function with respect to by examining the second derivative of (27). Therefore, the optimal solution can be found by taking the derivative of with respect to and equating it to zero. Let , then
which should be equal to zero when . Rearranging the terms we get
as corresponds to an infeasible system.
The optimal solution given by must satisfy the constraints and therefore must be greater than . For the case where , the optimal solution is clearly as is a convex function of .
In Theorem 2, the integer constraint on is relaxed and the resulting optimal is not necessarily an integer. However, it is clear that the optimal is either or as a result of the convexity of the cost function.
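The rounding step can be sketched with a toy cost function. The model below, a transmit-power term a / (M - b) plus a linear circuit cost c·M, is a hypothetical stand-in chosen only because it is strictly convex for M > b and has a closed-form relaxed minimizer; the symbols a, b, c and the function names are assumptions and are not the expressions of Theorem 2.

```python
import math


def cost(m, a, b, c):
    """Hypothetical convex cost: transmit power a/(m - b) plus circuit cost c*m."""
    if m <= b:
        return math.inf  # too few antennas: infeasible in this toy model
    return a / (m - b) + c * m


def best_integer_antennas(a, b, c, m_min, m_max):
    """Clip the relaxed optimum to [m_min, m_max], then compare floor and ceiling."""
    m_relaxed = b + math.sqrt(a / c)  # stationary point of the relaxed cost
    m_clipped = min(max(m_relaxed, m_min), m_max)
    candidates = {math.floor(m_clipped), math.ceil(m_clipped)}
    return min(candidates, key=lambda m: cost(m, a, b, c))


# Illustrative numbers: the relaxed optimum is about 32.36 antennas.
m_star = best_integer_antennas(a=500.0, b=10.0, c=1.0, m_min=11, m_max=100)
# With only 20 antennas available, the constraint is active.
m_capped = best_integer_antennas(a=500.0, b=10.0, c=1.0, m_min=11, m_max=20)
print(m_star, m_capped)
```

Only two cost evaluations are needed because a convex function restricted to the integers attains its minimum at one of the two integers adjacent to the real-valued minimizer.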
In Fig. 2, the power consumption for various values and MRT precoding () is depicted. In each trial, the optimal power vector for the given is obtained using (11) and is computed. Each point represents the average of independent trials with uniformly random user locations and the crosses (×) represent the average of computed by (32) and the corresponding . For each , is less than the minimum obtained. However, an important point is that is not necessarily an integer whereas the curves are obtained by using integer values. As decreases, the resulting cost functions decrease and the minimum is attained at a higher , as expected. It is only when is small that it is optimal to turn all antennas on, while in other cases one can save energy by turning antennas off.
Next, we investigate the relation between the optimum for MRT and ZF.
Consider the problem defined by (P2) and let and denote the optimum for MRT and ZF processing, respectively. Then,
For the case when all users have the same SINR requirements, i.e., for all , (33) reduces to . If , then the optimal number of antennas is equal for both approaches assuming that the optimal number of antennas is smaller than or equal to . MRT requires more antennas for and vice versa.
Fig. 3 illustrates the optimum number of antennas for (P2) for MRT and ZF processing with . The optimal number of antennas for MRT is lower when , it is lower for ZF when , and equal when , which is in alignment with Lemma 5. In all of the considered cases, it is optimal to turn off a subset of the antennas. If we define MaMIMO as being a system with at least 50 active antennas, then we need to have tens of users and/or high QoS constraints to make MaMIMO optimal.
Note that (33) allows us to also compare the two precoding techniques in a setup with heterogeneous SINR requirements and find the optimal number of antennas. It can be concluded that if the average SINR target is smaller than 1, MRT results in fewer required antennas at the optimal solution and vice versa.
IV Multi-Cell System
In this section, we extend the single-cell analysis in the previous section to a multi-cell setting, where both inter-cell and intra-cell interference are considered. Furthermore, it is usually not feasible to assign orthogonal pilot sequences to every user in the system, which results in pilot contamination that deteriorates the channel estimation quality and leads to coherent interference. We assume that users in a cell have mutually orthogonal pilots; however, the pilot sequences of two distinct cells are either completely orthogonal or replicated.
IV-A Uplink Training and Downlink Data Transmission
We adopt the notation from . Let denote the channel between user in cell and BS . The MMSE estimate of is denoted by and contains identically distributed components with
where denotes the large-scale fading coefficient and denotes the -length pilot sequence of user in cell . Note that the summation term in the denominator contains only the terms originating from the users with identical pilot sequences. The channel estimation error with MMSE, , is independent of the estimate and its elements are i.i.d. with .
Similar to the single-cell case, we consider MRT and ZF precoding. Let denote the set of cells that share the same pilot sequences with cell including the own cell, then the effective SINR for MRT is 
where is the normalized power allocation of BS for user and we assume that in cells with shared pilots th users have identical pilot sequences. Similarly for ZF, the effective SINR is given by (36).
First, we consider a multi-cell system with MRT and start with the following minimization problem:
where denotes the transmission power vector for BS and is the vector with the number of active antennas at each BS. The QoS requirement of user in cell is .
Combining the SINR constraints
in vector notation. Here is the power vector, is the normalized noise vector,
with each is a diagonal matrix where
is a diagonal matrix with
where and ; with
for MRT and
for ZF. Here, , and ; with
where , , and for both MRT and ZF.
It is straightforward to show that the optimal solution to (P3) is obtained when the SINR constraints are satisfied with equality. Hence, the minimum power solution is
if and only if the spectral radius of is less than unity. The solution in (45) guarantees that the SINR constraints are satisfied for all users. Furthermore, it is the minimum power solution, i.e., any other solution satisfying (37) requires at least as much power component-wise . Contrary to the single-cell case, in the multi-cell case there is a coherent interference term in (37) that scales with the number of BS antennas in the pilot-sharing cells which limits the SINRs of the system. Before we investigate the effect of coherent interference on the system and the interplay between the number of BS antennas and transmission powers, a brief introduction to M-matrices is provided.
Let be a square matrix with nonpositive off-diagonal elements and nonnegative diagonal entries, then can be expressed as
where and is a nonnegative matrix. Furthermore, assume that , then is a nonsingular M-matrix .
Next, we summarize some of the properties of M-matrices which will be utilized in the succeeding analysis.
Let be a square matrix, then the following statements are equivalent [26, Chapter 6]:
is a nonsingular M-matrix.
exists and .
is a monotone matrix, i.e., for all real vectors , .
The feasibility conditions based on the spectral radius of in single-cell case and in multi-cell case actually correspond to the resulting matrix being an M-matrix.
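The inverse-nonnegativity characterization can be checked numerically with a short sketch. It assumes the standard equivalence for matrices with nonpositive off-diagonal entries (such a matrix is a nonsingular M-matrix exactly when its inverse exists and is entrywise nonnegative); the function names and the example matrices are hypothetical.

```python
def invert(A, eps=1e-12):
    """Gauss-Jordan inverse with partial pivoting; returns None if singular."""
    n = len(A)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def is_nonsingular_m_matrix(A, tol=1e-12):
    """Check the sign pattern, then test whether the inverse is nonnegative."""
    n = len(A)
    if any(A[i][j] > 0 for i in range(n) for j in range(n) if i != j):
        return False  # off-diagonal entries must be nonpositive
    inv = invert(A)
    return inv is not None and all(x >= -tol for row in inv for x in row)


# I - F with a small interference matrix F: feasible power control.
feasible = is_nonsingular_m_matrix([[0.8, -0.1], [-0.3, 0.9]])
# Interference too strong: the inverse has negative entries.
infeasible = is_nonsingular_m_matrix([[0.1, -0.5], [-0.5, 0.1]])
print(feasible, infeasible)
```

In the power-control setting, a nonnegative inverse means that every positive noise vector maps to a nonnegative power vector, which is the feasibility condition discussed above.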
In order to obtain the solution to (P3), we need to investigate the interplay between the number of antennas and the transmission powers. Although it is clear that increasing the number of antennas in cell will result in a smaller transmission power vector for cell , the effect on the overall system is not clear, since the number of antennas also appears in the coherent interference term. Next, we address this problem and state the following.
Since the SINR constraints are satisfied with equality, we have
The left-hand side of (48) is non-negative since
and is an M-matrix. The matrix on the right-hand side of (48) can be rewritten as,
which is an M-matrix, as the feasibility of the system requires that the spectral radius of must be smaller than unity. Then, the nonnegativity of the left-hand side of (48) implies that .
Lemma 7 reveals that the total transmission power can be reduced by increasing the number of antennas. Furthermore, increasing the number of antennas in one cell also results in a lower total transmission power. This shows that even in the presence of coherent interference the overall system performance can be improved by increasing the number of antennas of BSs.
Lemma 8 shows that it is not possible to turn a feasible system into an infeasible one by increasing the number of antennas, similar to the single-cell case. Furthermore, combined with Lemma 7, a system with an increased number of antennas will also require less transmission power. Unfortunately, Lemmas 7 and 8 do not imply that a feasible solution can always be found by increasing the number of BS antennas. Next, we consider the feasibility of the system with respect to the number of antennas and state the following.
Recall that the SINR constraints given in (37) can concurrently be satisfied if and only if the spectral radius of is less than unity. Notice that
which implies that
as all of the matrices are non-negative. Although increasing the number of antennas will reduce the spectral radius of , even in the asymptotic region where the number of antennas approaches infinity the term due to coherent interference, , does not vanish. Hence, it is not possible to make the system feasible.
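The argument can be illustrated with a toy model in which the interference matrix is F(M) = C + N / M, where the entrywise positive matrices C (coherent part, independent of M) and N (non-coherent part) are hypothetical and chosen only for illustration. Since both are nonnegative, the spectral radius of F(M) is never smaller than that of C, however many antennas are used.

```python
def mat_vec(F, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(f * x for f, x in zip(row, v)) for row in F]


def spectral_radius(F, iters=1000):
    """Power-iteration estimate; assumes F is entrywise positive."""
    v = [1.0] * len(F)
    rho = 0.0
    for _ in range(iters):
        w = mat_vec(F, v)
        rho = max(w)
        v = [x / rho for x in w]
    return rho


def interference_matrix(C, N, m):
    """Toy model: coherent part C plus non-coherent part N scaled by 1/m."""
    return [[c + n / m for c, n in zip(c_row, n_row)]
            for c_row, n_row in zip(C, N)]


C = [[0.1, 0.5], [0.5, 0.1]]  # hypothetical coherent interference
N = [[4.0, 2.0], [2.0, 4.0]]  # hypothetical non-coherent interference
rho_floor = spectral_radius(C)
rhos = [spectral_radius(interference_matrix(C, N, m)) for m in (10, 100, 1000)]
print(rho_floor, rhos)
```

In this toy example the spectral radius decreases toward the floor set by C as M grows. If the targets were raised so that the spectral radius of C alone reached one, no number of antennas could make the system feasible.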
A crucial difference between the single-cell and multi-cell setups is revealed in Lemma 9. In the single-cell case, and it is always possible to make a system feasible by increasing the number of antennas whereas in a multi-cell system this is not the case. This phenomenon is a result of the coherent interference caused by pilot contamination, which does not vanish with increasing and limits the achievable SINRs. The maximum SINR that can be concurrently achieved by all users is summarized in the following lemma.
Consider a multi-cell system with and let denote the maximum SINR that can be achieved by all users. Then,
and equality is satisfied only as .
It is clear that is lower bounded by . As , the non-coherent interference term vanishes and for a feasible system must be less than unity. This gives the upper bound on the stated in (56).
An interesting property of the two precoding schemes is revealed by Lemma 10. In the asymptotic region, MRT and ZF give the same max-min SINR solution, which agrees with the asymptotic analysis given in [1, Section 4.4.1].
For a multi-cell system with orthogonal pilots assigned to each user, there is no coherent interference and it is possible to reach a feasible system for any target value by increasing the number of antennas (assuming ) in the system, similar to the single-cell case.
Next, we state the main result for the problem defined in (P3).
Consider the problem defined in (P3) and assume that there exists at least one feasible solution for for all . Then, and
where is the antenna matrix with each diagonal element equal to .
The result of Lemma 7 suggests that increasing the number of antennas results in a lower transmission power vector, and it has been shown that it is impossible to transform a feasible system into an infeasible one by increasing the number of antennas. Hence, using the maximum number of antennas at each BS will give us the solution to the problem defined in (P3).
The problem of minimizing only the total transmission power in the system results in a solution that requires using the maximum number of BS antennas in multi-cell systems, similar to Theorem 1 in the single-cell case. The total transmission power as a function of the number of antennas is depicted in Fig. 4 for a setup with MRT. However, the problem becomes more challenging when the cost of utilizing antennas is included in the cost function, which is examined next.
The joint optimization problem which also includes a cost for utilizing BS antennas is