I Introduction
The fifth-generation (5G) wireless communication system is expected to meet the ever-increasing mobile traffic, predicted to be 291.8 Exabytes by 2019 [1]. The number of mobile user devices (UDs) of 5G connections is forecast to reach a figure between 25 million and 100 million by 2021 [2]. In order to provide ubiquitous service access for such a huge number of UDs, the small cell concept [3] combined with massive multiple-input multiple-output (MIMO) [4] and/or millimeter-wave (mmWave) technologies [5] has emerged as one of the most promising network structures for 5G systems. However, substantial processing power is required for jointly exploiting the small cell’s spatial diversity and temporal diversity. This motivates the concept of centralized radio access network (CRAN) [6, 7], which consists of baseband units (BBUs) and remote radio units (RRUs). In a CRAN, multiple RRUs are connected to a BBU, which carries out all the baseband signal processing centrally, whilst the RRUs handle the radio frequency processing. Consequently, a substantial amount of information is exchanged over fronthaul links between RRUs and their host BBU, which imposes a bottleneck on CRAN [8] and prevents its large-scale practical deployment.
I-A Related Works
Extensive efforts have been devoted to the enhancement of fronthaul capacity, including compression/quantization [9, 10, 11], quality of service (QoS) guarantee [12, 13], interference mitigation [14, 15], user/access link selection [16, 17, 18], as well as resource allocation and optimization [8, 19, 20, 21].
To elaborate, compression/quantization schemes were designed for reducing the data traffic of fronthaul links to meet their capacity constraint. Explicitly, Zhou et al. [9] studied a joint fronthaul compression and transmit beamforming scheme relying on noise covariance matrix quantization in the context of traditional MIMO having a small number of antennas. Lee et al. [10] investigated multivariate fronthaul quantization motivated by network information theory, which is capable of jointly optimizing the downlink precoding and quantization at a reduced complexity. By contrast, Vu et al. [11] derived a block error rate metric in the context of Rayleigh fading channels for designing an adaptive compression scheme.
QoS is a general term that includes packet loss ratio, bit error ratio (BER), throughput, transmission delay, etc. Since it is challenging to investigate all the QoS metrics jointly, the existing studies mainly concentrate on just a single or a few aspects of QoS. Explicitly, by relying on content caching, Zhao et al. [12] improved the link-level effective capacity, while the dynamic network slicing scheme of [13] improved the QoS in terms of rate-fairness, as well as maximum and minimum rates.
Through interference reduction, the achievable fronthaul capacity can also be increased. Hence Liu et al. [14] focused on interference mitigation by exploiting the inherent sparsity of CRAN. Hao et al. [15] considered the mitigation of intra-cluster interference and inter-tier interference by exploiting coordinated multipoint transmission and by allocating distinct bandwidths to BBUs and RRUs, respectively.
Fronthaul link selection and user association, which are typically investigated together with precoding [17, 18], maximize the achievable sum-rate of a CRAN, given a fixed fronthaul capacity constraint. Furthermore, Pan et al. [16] studied the joint optimization of RRU selection, user association and beamforming in the presence of imperfect channel state information (CSI). Resource allocation and optimization [8, 19, 20, 21] also offer an effective means of enhancing the achievable throughput, given a limited fronthaul capacity.
Upgrading the network infrastructure by replacing copper cabling with fiber cabling for fronthaul connections has also been considered as an alternative solution to provide high-capacity fronthaul [22]. However, this approach suffers from poor flexibility and high cost in large-scale deployments [24, 23]. Laying optical fiber to connect the RRUs to their host BBU is impossible in city centres of some countries [25]. Thus wireless backhauling/fronthauling [26, 27, 28, 29, 30], relying on massive MIMO [4] and mmWave [5, 30], has recently emerged as a promising solution for 5G networks due to its flexibility and cost-efficiency [31, 23, 32]. Park et al. [31] proposed a partially centralized CRAN based on massive MIMO schemes, while Parsaeefard et al. [32] allocated resources by appropriately adjusting the parameters of the RRU, BBU and fronthaul as well as the power allocated to UDs.
I-B Our Contributions
A UD’s achievable throughput inherently depends on both the UD-to-RRU links and the RRU-to-BBU links. Thus all the UD-to-RRU links and RRU-to-BBU links should be considered together in order to maximize a CRAN’s capacity. By exploiting the CRAN’s superior centralized signal processing/resource control capability over both the UD-to-RRU links and the RRU-to-BBU links, we propose to boost the fronthaul capacity by globally optimizing the power sharing for both the RRUs and UDs located within a BBU’s service coverage. Intuitively, allocating more power to pilot training will result in more accurate channel estimates, which increases the achievable throughput. However, increasing the power allocated to pilot training reduces the power left for data transmission, which in turn reduces the achievable data throughput. This paper addresses for the first time how to optimize the powers allocated to pilot training and to data transmission both for the UDs and RRUs. The main contributions of this paper are as follows.

We investigate the CRAN’s configuration, which employs large numbers of antennas both at the RRUs and their host BBU. We formulate the ultimately achievable uplink sum-rate of the CRAN as a function of the signal-to-interference-plus-noise ratio (SINR) by considering both the UD-to-RRU links and the RRU-to-BBU links. Furthermore, we derive a closed-form expression of the asymptotic achievable uplink sum-rate in the presence of realistic channel estimation errors both for the UD-to-RRU links and the RRU-to-BBU links.

We propose to boost the uplink fronthaul capacity by globally optimizing the power sharing between the pilots and data transmission both for the RRUs and for the UDs within a BBU’s service coverage. Specifically, given the power of a UD, the UD’s power sharing factor controls the specific fractions of power allocated to the UD’s uplink pilot and to the UD’s uplink data transmission, respectively. Similarly, given the power of an RRU, the power allocated to the RRU’s uplink pilot and the power allocated to the RRU’s uplink data forwarding are controlled by the RRU’s power sharing factor. We formulate the uplink sum-rate as a function of all the UDs’ power sharing factors and all the RRUs’ power sharing factors. We then maximize this uplink sum-rate by invoking the differential evolution algorithm (DEA), which is a global optimization algorithm.
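To make the optimization step more concrete, the following minimal sketch maximizes a function of several power sharing factors confined to [0, 1] by differential evolution. It is an illustration under stated assumptions only: the objective `toy_sum_rate` is a hypothetical stand-in rather than the sum-rate derived in this paper, and the DE/rand/1/bin variant as well as the population size, scaling factor and crossover rate are assumed values chosen for the example.

```python
import math
import random


def toy_sum_rate(etas):
    """Hypothetical stand-in objective (NOT the paper's sum-rate).

    Each factor eta in [0, 1] splits a unit power budget between pilots
    (eta) and data (1 - eta). The product eta * (1 - eta) mimics the
    tension between estimation quality and data power.
    """
    return sum(math.log2(1.0 + 10.0 * eta * (1.0 - eta)) for eta in etas)


def differential_evolution(objective, dim, pop_size=20, scale=0.5,
                           crossover=0.9, generations=100, seed=0):
    """Maximize `objective` over [0, 1]^dim with a DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[a][d] + scale * (pop[b][d] - pop[c][d])
                    value = min(1.0, max(0.0, value))  # clip to [0, 1]
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_fitness = objective(trial)
            if trial_fitness >= fitness[i]:  # greedy one-to-one selection
                pop[i], fitness[i] = trial, trial_fitness
    best = max(range(pop_size), key=lambda i: fitness[i])
    return pop[best], fitness[best]


best_etas, best_rate = differential_evolution(toy_sum_rate, dim=4)
```

For this toy objective the maximum lies at a sharing factor of 0.5 in every coordinate, which makes the sketch easy to sanity-check; the actual objective couples the UDs' and RRUs' factors and has no such trivial optimum.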
I-C Notations
Throughout our discussions, $(\cdot)^{\rm T}$ and $(\cdot)^{\rm H}$ stand for the transpose and Hermitian transpose of a vector/matrix, respectively.
is the zero vector, which is abbreviated as , and denotes the zero matrix, while following the convention, the identity matrix is represented by . The expectation operation is represented by and denotes the column stacking operation applied to matrix . The diagonal matrix with at its diagonal entries is denoted by and the block diagonal matrix has as its block diagonal entries, while denotes the Kronecker product and is the trace operator. Furthermore, and denote the th row and th column of , respectively. The subscripts tx and rx indicate that the variable considered is at the transmitter and at the receiver, respectively. The superscript denotes a variable between the UDs and the RRU, and the superscript denotes a variable between the RRUs and the BBU, respectively. The superscripts and indicate variables related to the UD’s data transmission and pilot training, and the superscripts and indicate variables related to the RRU’s data transmission and pilot training, respectively. For easy reference, below we list the key mathematical symbols used in the manuscript.


Total number of UDs.

Number of RRUs.

Number of UDs served by the th RRU.

Number of RAs equipped by RRU.

Number of RAs equipped by BBU.

Total power of a UD.

Total power of an RRU.

Receiver’s efficiency factor of the th RRU.

Received signal processing power of the th RRU.

Pathloss.

Distance between the th UD and the th RRU.

Distance between the th RRU and its host BBU.

Normalization factor at the th RRU.

Normalization factor at the BBU.

Power sharing factor of the th UD.

Power sharing factor of the TA for forwarding the th UD’s data.

Transmit power of the th UD, for pilot and for data.

Transmit power of the th TA at the th RRU, for pilot and for data.

Receive signal power of the th UD at the th RRU, for pilot and for data.

Receive signal powers of the th RRU’s UDs at the th RRU, for pilot and for data.

Receive signal powers at the BBU.

Transmit signals of the UDs served by the th RRU.

Uplink MIMO channel between the th RRU and its UDs.

Interfering channel between the th RRU’s UDs and the th RRU.

Signals received at the th RRU.

AWGN vector at the th RRU.

Receiver combining matrix used by the th RRU.

Uplink signals after combining at the th RRU.

Noise output after the th RRU’s combining.

Power amplification at the th RRU.

Transmit signals consisting of all the signals transmitted by all the RRUs.

Rician factor.

MIMO Rician channel between the forwarding TAs of all the RRUs and their host BBU.

Deterministic part of the Rician channel.

Scattered component of the Rician channel .

Signals received at the BBU.

AWGN vector at the BBU.

BBU’s combining matrix.

Noise output after BBU’s combining.

Uplink signals after combining at BBU.
Additionally, the main abbreviations used are listed below.

AWGN: Additive white Gaussian noise

BBU: Baseband unit

CRAN: Centralized radio access network

CSI: Channel state information

DEA: Differential evolution algorithm

i.i.d.: Independently and identically distributed

LoS: Line-of-sight

MF: Matched filter

mmWave: Millimeter-wave

RA: Receive antenna

RRU: Remote radio unit

SINR: Signal-to-interference-plus-noise ratio

TA: Transmit antenna

UD: User device
The rest of this paper is organized as follows. Section II describes the massive MIMO aided uplink CRAN architecture, and Section III derives its closed-form achievable asymptotic uplink sum-rate. The global optimization metric as a function of all the power sharing factors and how to maximize it are presented in Section IV. Our simulation results are presented in Section V for demonstrating the efficiency of the proposed approach, whilst our conclusions are given in Section VI.
II System Model
Consider an uplink CRAN architecture as illustrated in Fig. 1. Each RRU is connected to its host BBU via a wireless fronthaul link. Assume that the frequency is reused within the coverage of a BBU, while orthogonal access is adopted by the different BBUs. Thus there is no interference between the UDs located in the different BBUs’ coverage areas. Hence we only have to consider a single BBU’s coverage. The BBU employs receive antennas (RAs) to serve RRUs, while each RRU is equipped with RAs. The th RRU employs transmit antennas (TAs) and serves single-TA UDs, where . The total number of UDs within the BBU’s coverage is
. According to the well-known spatial multiplexing gain, or spatial degree of freedom, the number of independent data streams supported cannot be higher than the number of receive antennas. The spatial degree of freedom between the
th RRU and the UDs it serves is given by [33, 34]. Therefore, the number of UDs served by the th RRU is no more than the number of the th RRU’s RAs, i.e., . We further assume that for . Since the th RRU uses TAs for forwarding its served UDs’ data to the BBU, the total number of TAs for all the RRUs is , which equals the number of uplink fronthaul streams. Clearly, is no more than the number of the BBU’s RAs, i.e., .
II-A Power Consumption Preliminaries
For a fair power allocation, all the UDs have the same total power , which is shared by pilot training and data transmission of each UD. Explicitly, for the link of the th UD to the th RRU, the power allocation between pilot training and data transmission is controlled by a power sharing factor . Let the power of pilot training be and that of data transmission be , respectively, for the th UD served by the th RRU. Then,
(1)  
(2) 
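As a small numerical illustration of the split described by (1) and (2), the sketch below divides a UD's total power between pilot training and data transmission. The function and argument names are hypothetical, and the convention that the sharing factor is the fraction assigned to pilots is an assumption; under the opposite convention the two outputs are simply swapped.

```python
def split_power(total_power, sharing_factor):
    """Split a power budget into (pilot_power, data_power).

    Assumption: `sharing_factor` in [0, 1] is the fraction given to pilots,
    and the remainder goes to data transmission.
    """
    if not 0.0 <= sharing_factor <= 1.0:
        raise ValueError("sharing_factor must lie in [0, 1]")
    pilot_power = sharing_factor * total_power
    data_power = (1.0 - sharing_factor) * total_power
    return pilot_power, data_power


# Example values are arbitrary: a 0.2 W budget with a quarter spent on pilots.
pilot, data = split_power(total_power=0.2, sharing_factor=0.25)
```

Whatever the convention, the two powers always add up to the UD's total power, so a single factor per UD fully describes the split.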
For convenience, we introduce the new symbol with indicating pilot training and indicating data transmission. At the th RRU, the received signal power of the th UD is given by
(3) 
where the pathloss is given by [35]
(4) 
in which [m] is the distance between the th UD and its host, the th RRU, and [Hz] is the carrier frequency. For the 5G system, GHz has been allocated in the United Kingdom (UK) [36], which will be considered in our investigations.
Each RRU is allocated the same total power for the sake of fairness, which is shared by the received signal processing of detecting the served UDs’ data as well as the pilots and data transmission for forwarding these UDs’ data to the BBU. Intuitively, the power consumed by the th RRU’s received signal processing is related to the uplink sum-rate of its served UDs, which is dominated by the uplink transmit power. Since an accurate model of this received signal processing is absent from the literature, we approximately model the th RRU’s received signal processing power consumption as a function of its served UDs’ uplink transmit power by
(5) 
where is the receiver’s efficiency factor of the th RRU. The remaining power of the th RRU is shared by its TAs for pilots and data transmission. Let be the power sharing factor of the TA for forwarding the th UD’s data. Then, the pilot training power and data transmission power for the th TA are given by
(6)  
(7) 
At the BBU, the received signal power is related to the transmit signal power by a similar pathloss model . Since mmWave communication is established between the RRUs and their host BBU, the pathloss of the RRU-to-BBU link is given by [37, 38]
(8) 
where [m] is the distance between the th RRU and its host BBU, while [GHz] is the carrier frequency. For the mmWave based 5G system in the UK, GHz is allocated [36], which is used for our investigations.
(13) 
II-B Signal Model of UD-to-RRU
Obviously the pilot training and data transmission have the same signal model. Again, we introduce the symbol , with indicating the pilot training and representing data transmission, respectively. Then, the signals received at the th RRU can be expressed generically as
(9) 
where is the uplink MIMO channel between the th RRU and its UDs, is the transmit signal vector with for , and is the additive white Gaussian noise (AWGN) vector with the distribution . Still referring to (9), and are the received powers of the UDs’ signals at the th RRU, while is the interfering channel matrix between the th RRU’s UDs and the th RRU, and are the received powers of the th RRU’s UDs at the th RRU. Since all the UDs’ signals suffer from the same noise at the RRU, the noise powers of all the UD-to-RRU links are identical. Furthermore, the second term in (9) represents the interference imposed by the UDs of the adjacent RRUs.
Because the UDs are randomly distributed in the RRUs’ coverage areas and there are many obstructions between the UDs and their host RRU, the direct line-of-sight (LoS) paths may always be blocked. Hence, the channels between the UDs and their host RRU are Rayleigh channels, and the UD-to-RRU MIMO channel matrix can be expressed as
(10) 
where is the spatial correlation matrix of the th RRU’s RAs and is the spatial correlation matrix of the UDs, while has independently and identically distributed (i.i.d.) complex entries, each of which has the distribution of . Because the UDs are randomly distributed and they are independent of each other, there is no correlation between the TAs of different UDs and we have .
Let the receiver combining matrix used by the th RRU be . The uplink signals after combining at the th RRU can be expressed as
(11) 
where is the normalization factor, and is the effective noise vector having the distribution of with the covariance matrix . When a matched filter (MF) is used for uplink combining, we have
(12) 
where is the estimate of the uplink channel .
The optimal minimum mean square error (MMSE) channel estimator [39] is given by (13), where is the received signal matrix over the pilot symbols, and is the pilot symbol matrix with , while are the received powers of the pilot symbols. Here we assume that the pilot contamination imposed by the UDs served by the adjacent RRUs has been eliminated by optimal pilot design [40, 41]. Furthermore, in (13), denotes the covariance matrix of , which is given by
(14) 
The MMSE estimate (13) follows the distribution [39]
(15) 
with the covariance matrix given by
(16) 
The relationship between the MMSE channel estimate and the true channel is given by
(17) 
where the channel estimation error is statistically independent of both and . Moreover, the distribution of is
(18) 
with the covariance matrix given by
(19) 
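The properties stated around (15)-(19) can be checked numerically in the simplest special case. The sketch below assumes a single scalar channel with unit variance, one pilot symbol and no spatial correlation, which is a deliberate simplification of the matrix estimator in (13); all function and variable names are illustrative. It estimates by Monte Carlo simulation the variance of the MMSE estimate, the variance of the estimation error, and the magnitude of their cross-correlation.

```python
import random


def complex_gaussian(rng, variance):
    """Draw one zero-mean circularly symmetric complex Gaussian sample."""
    std = (variance / 2.0) ** 0.5
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def simulate_scalar_mmse(pilot_power, noise_var, trials=20000, seed=1):
    """Monte Carlo check of scalar MMSE estimation for h ~ CN(0, 1).

    Observation: y = sqrt(pilot_power) * h + n, with n ~ CN(0, noise_var).
    MMSE estimate: h_hat = sqrt(pilot_power) / (pilot_power + noise_var) * y.
    Returns (estimate variance, error variance, |cross-correlation|).
    """
    rng = random.Random(seed)
    gain = pilot_power ** 0.5 / (pilot_power + noise_var)
    est_power = 0.0
    err_power = 0.0
    cross = 0j
    for _ in range(trials):
        h = complex_gaussian(rng, 1.0)
        y = pilot_power ** 0.5 * h + complex_gaussian(rng, noise_var)
        h_hat = gain * y
        err = h - h_hat
        est_power += abs(h_hat) ** 2
        err_power += abs(err) ** 2
        cross += h_hat * err.conjugate()
    return est_power / trials, err_power / trials, abs(cross) / trials


est_var, err_var, cross_corr = simulate_scalar_mmse(pilot_power=4.0, noise_var=1.0)
# Closed-form values for this scalar case, with p = pilot_power and
# s2 = noise_var: estimate variance p / (p + s2), error variance s2 / (p + s2).
```

In this scalar case the two variances add up to the channel variance and the estimate is uncorrelated with the error, which also shows why raising the pilot power improves the estimate at the expense of the power left for data.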
II-C Signal Model of RRU-to-BBU
The UDs’ data signals received by their host RRUs are forwarded to the BBU after power amplification. Ideally, the th UD’s data received by the th RRU is scaled to its transmit signal constellation by a power amplification coefficient . Hence, the power amplification at the th RRU is represented by the diagonal matrix given by
(20) 
and the th RRU forwards the amplified signal to its host BBU via the fronthaul links. Explicitly, let be the transmit signal vector consisting of all the signals transmitted by the RRUs, which is given by
(21) 
(36) 
(46) 
(47) 
where we have
(22)  
(23)  
(24)  
(25) 
(26)  
(27)  
(28)  
(29)  
(30) 
In (29), , , depends on the value of , which is given by
(31) 
Similarly, in (30), if , is as it is, while if , .
The signals received at the BBU are expressed as
(32) 
where is the AWGN vector, are the received signal powers at the BBU, and is the MIMO channel matrix between the forwarding TAs of the RRUs and their host BBU. The RRUs are generally stationary and are carefully positioned, so that the direct LoS paths always exist between the RRUs and their host BBU. Thus the channels between the RRUs and their host BBU are Rician channels and, therefore, the MIMO channel matrix is given by
(33) 
where is the deterministic part of the Rician channel satisfying , and with being the Rician factor, while is the scattered component of the Rician channel in which is the spatial correlation matrix of the RAs at the BBU, is the spatial correlation matrix of the forwarding TAs of the RRUs, and has i.i.d. complex entries and each of them has the distribution . The th RRU has a total of antennas, which is much more than , and it can always select its forwarding TAs to be spaced sufficiently far apart. Consequently, the correlations between the TAs can be assumed to be zero. Furthermore, there exists no correlation between the antennas of different RRUs. Thus we can assume that .
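The structure of the Rician channel described above can be sketched as follows. The sketch is illustrative and rests on several assumptions that are not taken from (33): the common normalization in which the LoS part is weighted by sqrt(K/(K+1)) and the scattered part by sqrt(1/(K+1)), unit-modulus LoS entries with placeholder random phases (in practice they follow from the array geometry), and uncorrelated scattering at both ends, whereas the text only assumes zero correlation at the forwarding TAs.

```python
import cmath
import math
import random


def rician_channel(num_rx, num_tx, rician_factor, seed=0):
    """Draw one num_rx x num_tx Rician channel as nested lists.

    Assumptions: unit-modulus LoS entries with placeholder random phases,
    uncorrelated scattering, and the normalization
    H = sqrt(K / (K + 1)) * H_los + sqrt(1 / (K + 1)) * H_scatter.
    """
    rng = random.Random(seed)
    los_weight = math.sqrt(rician_factor / (rician_factor + 1.0))
    nlos_weight = math.sqrt(1.0 / (rician_factor + 1.0))
    std = math.sqrt(0.5)  # per real dimension, so each entry is CN(0, 1)
    channel = []
    for _ in range(num_rx):
        row = []
        for _ in range(num_tx):
            los = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
            scatter = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
            row.append(los_weight * los + nlos_weight * scatter)
        channel.append(row)
    return channel


# Example dimensions and Rician factor are arbitrary illustration values.
H = rician_channel(num_rx=64, num_tx=16, rician_factor=5.0)
avg_gain = sum(abs(x) ** 2 for row in H for x in row) / (64 * 16)
```

With this normalization the average power per channel entry stays at unity for any Rician factor, so changing the factor only shifts power between the deterministic and the scattered component.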
The uplink signals after combining can be expressed as
(34) 
where is the BBU’s combining matrix, , and is the noise output after the combining, which obeys the distribution with the covariance matrix . Again, the MF is adopted by the BBU and is given by
(35) 
where is the estimate of the uplink channel .
The MMSE channel estimate [39] is given by (36), where is the received signal matrix over the pilot symbols before the receiver combining, with the LoS component removed, and is the pilot symbol matrix with , while are the received powers of the pilot symbols. In (36), is the covariance matrix of and it is given by
(37) 
The distribution of the MMSE estimate is [39]