Distributed Precoding Design via Over-the-Air Signaling for Cell-Free Massive MIMO

04/01/2020 ∙ by Italo Atzeni, et al. ∙ University of Oulu

Most works on cell-free massive multiple-input multiple-output (MIMO) consider non-cooperative precoding strategies at the base stations (BSs) to avoid extensive channel state information (CSI) exchange via backhaul signaling. However, considerable performance gains can be accomplished by allowing coordination among the BSs. This paper proposes the first distributed framework for cooperative precoding design in cell-free massive MIMO (and, more generally, in joint transmission coordinated multi-point) systems that entirely eliminates the need for backhaul signaling for CSI exchange. A novel over-the-air (OTA) signaling mechanism is introduced such that each BS can obtain the same cross-term information that is traditionally exchanged among the BSs via backhaul signaling. The proposed distributed precoding design enjoys desirable flexibility and scalability properties, as the amount of OTA signaling does not scale with the number of BSs or user equipments. Numerical results show fast convergence and remarkable performance gains as compared with non-cooperative precoding design. The proposed scheme may also outperform the centralized precoding design under realistic CSI acquisition.


I Introduction

Massive multiple-input multiple-output (MIMO) and joint transmission coordinated multi-point (JT-CoMP) are two of the physical-layer wireless technologies that have attracted the most attention during the past ten years. In massive MIMO networks, each base station (BS) is equipped with a large number of antenna elements and serves numerous user equipments (UEs) simultaneously by means of highly directional beamforming techniques [Mar10, Bjo18]. On the other hand, JT-CoMP enables coherent transmission from clusters of BSs to overcome the inter-cell interference within each cluster [Ges10, Jun14]. While the upcoming 3GPP New Radio (NR) standard for 5G will have massive MIMO as one of its cornerstones [Eri18], it will not include JT-CoMP (at least in its first releases) as its implementation in the Long-Term Evolution-Advanced (LTE-A) standard [Lee12] did not achieve significant gains in practice. This can be mainly attributed to the considerable amount of backhaul signaling required for channel state information (CSI) and data sharing [Bas17] but also to a network-centric approach to coherent transmission [Int19], whereby the BSs in a cluster cooperate to serve the UEs in their joint coverage region. The practical implementation of JT-CoMP was also hindered by other attributes of LTE-A, such as a frequency division duplex dominated macro-cell deployment and a rigid frame/slot structure in its time division duplex (TDD) mode of operation, which did not allow for a flexible channel estimation.

Cell-free massive MIMO [Int19, Zha19] is a recently coined concept that conveniently combines elements from massive MIMO [Bjo18], small cells [Jun14], and UE-centric JT-CoMP [Buz20]. In UE-centric coherent transmission, the clusters of BSs are formed so that each UE is served by its closest BSs, and each BS may cooperate with different sets of BSs when serving different UEs. In a cell-free context, the massive MIMO regime is achieved by spreading a large number of antenna elements across the network (even in the form of single-antenna BSs [Ngo17, Nay17]), which provides enhanced coverage and reduced pathloss. Moreover, a UE-centric coherent transmission extended to the whole network, where each UE is served jointly by all the BSs, ideally allows the inter-cell interference to be eliminated entirely. To this end, all the BSs are assumed to be connected to a central processing unit (CPU) by means of backhaul links, which together provide the UE-specific data and, possibly, enable network-wide processing for the computation of the precoding strategies.

Cell-free massive MIMO has been the subject of an extensive literature over the past few years and is now regarded as a potential physical-layer paradigm shift for beyond-5G systems [Zha20]. Remarkably, cell-free massive MIMO networks have been shown to outperform traditional cellular massive MIMO and small-cell networks in several practical scenarios [Nay17, Ngo17, Bjo20]. Their performance has been analyzed under several realistic network and hardware assumptions, e.g., with hybrid analog-digital precoding [Alo19], with low-resolution analog-to-digital converters [Hu19, Zha19a], under channel non-reciprocity [Pal20], as well as with hardware impairments and limited backhaul capacity [Zha18, Mas19, Fem19]. Another important focus is the global energy efficiency, which has been studied considering the impact of backhaul power consumption [Ngo18] and quantization [Bas19a] among other factors. To avoid CSI exchange among the BSs via backhaul signaling and to reduce the overall computational complexity, most of the aforementioned works assume simple non-cooperative precoding strategies at the BSs, such as maximum ratio transmission (MRT), local zero-forcing (ZF) precoding, and local minimum mean squared error (MMSE) precoding, which can be implemented based on locally acquired CSI (see also [Int19b]). However, the performance of cell-free massive MIMO systems can be considerably improved by increasing the level of coordination among the BSs [Bjo20].

Cooperative precoding design for JT-CoMP can be broadly classified into centralized and distributed approaches. In the centralized precoding design, the BSs forward their locally acquired CSI to the CPU via backhaul signaling and the CPU feeds back the optimized precoding strategies to the BSs. Here, both the amount of CSI exchange between the BSs and the CPU and the computational complexity involved in the precoding optimization at the CPU may be overwhelming due to the high dimensionality of the aggregated channels. To avoid the centralized computation, [Kal18] proposed a distributed iterative framework for JT-CoMP that allows the precoding strategies to be optimized locally at each BS using bi-directional training between the BSs and the UEs [Tol19] in addition to periodic exchange of cross-term information among nearby BSs via backhaul signaling. Despite a significant complexity reduction, the extensive CSI exchange among the BSs makes the practical implementation of [Kal18] challenging; furthermore, the backhaul introduces delays and quantization errors into the CSI exchange that can appreciably degrade the performance of the precoding design. These issues are particularly critical in a cell-free massive MIMO context due to the large number of BSs and UEs involved in the joint processing.

I-A Contribution

Non-cooperative precoding strategies (such as MRT, local ZF precoding, and local MMSE precoding) have been so far preferred in the cell-free massive MIMO literature as they do not require any CSI exchange via backhaul signaling. However, the fact that the channel hardening effect is less pronounced in cell-free massive MIMO than in cellular massive MIMO [Int19] suggests that cooperative precoding design can bring considerable performance gains over its non-cooperative counterpart. In this paper, we bridge this gap and propose the first distributed framework for cooperative precoding design in cell-free massive MIMO (and, more generally, in JT-CoMP) systems that entirely eliminates the need for backhaul signaling for CSI exchange. Focusing on the weighted sum mean squared error (MSE) minimization, a novel over-the-air (OTA) signaling mechanism allows each BS to obtain the same cross-term information that was exchanged among the BSs via backhaul signaling in [Kal18]. Specifically, this is achieved by introducing a new uplink signaling resource and a new CSI combining mechanism that complement the existing uplink and downlink pilot-aided channel estimations. The proposed distributed precoding design enjoys desirable flexibility and scalability properties, as the amount of OTA signaling does not scale with the number of BSs or UEs; furthermore, there are no delays in the CSI exchange among the BSs. These practical benefits come at the cost of extra uplink signaling overhead per bi-directional training iteration, which, however, results in a minor performance loss with respect to the distributed precoding design via backhaul signaling.

The contributions of this paper are summarized as follows:

  • Building on existing tools from JT-CoMP and considering multi-antenna UEs, we describe centralized and distributed precoding schemes for cell-free massive MIMO under both perfect CSI and realistic pilot-aided CSI acquisition.

  • We propose a distributed precoding design where the CSI exchange among the BSs via backhaul signaling in [Kal18] is entirely replaced by a novel OTA signaling mechanism, which does not scale with the number of BSs or UEs.

  • We address relevant implementation aspects of the proposed distributed precoding design and illustrate how the OTA signaling can be integrated into the flexible 5G 3GPP NR frame/slot structure [3GPP_38.211].

  • Numerical results show significant performance gains in terms of average sum rate over non-cooperative precoding design even after a small number of iterations. Remarkably, the proposed distributed precoding design via OTA signaling outperforms its centralized counterpart in the presence of imperfect CSI, and its considerable practical benefits with respect to the case with ideal backhaul signaling come at the cost of a very modest performance loss.

Outline. The rest of the paper is structured as follows. Section II introduces the cell-free massive MIMO system model. Section III describes the centralized and the distributed precoding design with perfect CSI. Then, Section IV extends the previous section by considering realistic pilot-aided CSI acquisition. As the main contribution of this paper, Section V presents the distributed precoding design via OTA signaling. In Section VI, numerical results are reported to illustrate the remarkable performance of the proposed scheme in different practical scenarios. Finally, Section VII summarizes our contributions and draws some concluding remarks.

Notation. Lowercase and uppercase boldface letters denote vectors and matrices, respectively, whereas $(\cdot)^{\mathrm{T}}$ and $(\cdot)^{\mathrm{H}}$ are the transpose and Hermitian transpose operators, respectively. $\|\cdot\|$ and $\|\cdot\|_{\mathrm{F}}$ represent the Euclidean norm for vectors and the Frobenius norm for matrices, respectively. $\Re[\cdot]$ and $\mathbb{E}[\cdot]$ are the real part and expectation operators, respectively. $\mathbf{I}_L$ denotes the $L$-dimensional identity matrix and $\mathbf{0}$ represents the zero vector or matrix with proper dimension. $\mathrm{tr}(\cdot)$ is the trace operator and $\mathrm{Diag}(\cdot)$ produces a diagonal matrix with the elements of the vector argument on the diagonal. $[a_1, \ldots, a_L]$ denotes horizontal concatenation, whereas $\{a_1, \ldots, a_L\}$ or $\{a_\ell\}_{\ell \in \mathcal{L}}$ denote the set of elements in the argument. Lastly, $\mathcal{CN}(0, \sigma^2)$ is the complex normal distribution with zero mean and variance $\sigma^2$, whereas $\nabla_{\mathbf{x}}(\cdot)$ denotes the gradient with respect to $\mathbf{x}$.

II System Model

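Consider a downlink cell-free massive MIMO network where a set of BSs $\mathcal{B} \triangleq \{1, \ldots, B\}$, each equipped with $M$ antennas, serves a set of UEs $\mathcal{K} \triangleq \{1, \ldots, K\}$, each equipped with $N$ antennas. Assuming a TDD setting and, for simplicity, a single data stream per UE, let $\mathbf{H}_{b,k} \in \mathbb{C}^{M \times N}$ be the uplink channel matrix between UE $k$ and BS $b$, with $\mathbf{H}_k \triangleq [\mathbf{H}_{1,k}^{\mathrm{T}}, \ldots, \mathbf{H}_{B,k}^{\mathrm{T}}]^{\mathrm{T}} \in \mathbb{C}^{BM \times N}$ denoting the aggregated uplink channel matrix of UE $k$. Likewise, let $\mathbf{w}_{b,k} \in \mathbb{C}^{M \times 1}$ be the BS-specific precoding vector used by BS $b$ for UE $k$, with $\mathbf{w}_k \triangleq [\mathbf{w}_{1,k}^{\mathrm{T}}, \ldots, \mathbf{w}_{B,k}^{\mathrm{T}}]^{\mathrm{T}} \in \mathbb{C}^{BM \times 1}$ denoting the aggregated precoding vector used for UE $k$; here, we assume the per-BS power constraints $\sum_{k \in \mathcal{K}} \|\mathbf{w}_{b,k}\|^2 \leq \rho_{\mathrm{BS}}$ for all $b \in \mathcal{B}$, where $\rho_{\mathrm{BS}}$ denotes the maximum transmit power at each BS. Note that, according to the previous definitions, we have $\mathbf{H}_k^{\mathrm{H}} \mathbf{w}_{\bar{k}} = \sum_{b \in \mathcal{B}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,\bar{k}}$. Hence, the receive signal at UE $k$ is given by

$$\mathbf{y}_k \triangleq \sum_{b \in \mathcal{B}} \sum_{\bar{k} \in \mathcal{K}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,\bar{k}} d_{\bar{k}} + \mathbf{z}_k \in \mathbb{C}^{N \times 1} \tag{1}$$

where $d_k$ is the transmit data symbol for UE $k$, with $\mathbb{E}[|d_k|^2] = 1$, and $\mathbf{z}_k$ is the additive white Gaussian noise (AWGN) term at UE $k$ with elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{UE}}^2)$. Upon receiving $\mathbf{y}_k$, UE $k$ uses the combining vector $\mathbf{v}_k \in \mathbb{C}^{N \times 1}$ to combine $\mathbf{y}_k$ and the resulting signal-to-interference-plus-noise ratio (SINR) reads as

$$\mathrm{SINR}_k \triangleq \frac{\big| \sum_{b \in \mathcal{B}} \mathbf{v}_k^{\mathrm{H}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,k} \big|^2}{\sum_{\bar{k} \neq k} \big| \sum_{b \in \mathcal{B}} \mathbf{v}_k^{\mathrm{H}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,\bar{k}} \big|^2 + \sigma_{\mathrm{UE}}^2 \|\mathbf{v}_k\|^2} \tag{2}$$

From (2), it is easy to observe that the design of the precoding vectors depends on the combining vectors and vice versa. Finally, the sum rate (measured in bps/Hz) is given by

$$R \triangleq \sum_{k \in \mathcal{K}} \log_2 (1 + \mathrm{SINR}_k) \tag{3}$$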
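As a minimal sketch of how (2) and (3) are evaluated, the following Python snippet computes the sum rate from the effective scalar gains seen by each UE after combining. The names `eff`, `noise_var`, and `v_norm_sq` are hypothetical and introduced only for illustration: `eff[k][j]` stands for the combined gain at UE k of the stream intended for UE j, summed over all the BSs.

```python
import math

def sum_rate(eff, noise_var, v_norm_sq):
    """Sum rate in bps/Hz from effective gains after receive combining.

    eff[k][j]    : complex gain at UE k of the stream intended for UE j
    noise_var    : noise variance at the UEs
    v_norm_sq[k] : squared norm of the combining vector of UE k
    """
    rate = 0.0
    for k, row in enumerate(eff):
        signal = abs(row[k]) ** 2
        interference = sum(abs(g) ** 2 for j, g in enumerate(row) if j != k)
        sinr = signal / (interference + noise_var * v_norm_sq[k])
        rate += math.log2(1.0 + sinr)
    return rate

# Two UEs with no mutual interference and unit SINR each (illustrative values).
eff = [[1.0 + 0j, 0.0 + 0j],
       [0.0 + 0j, 1.0 + 0j]]
print(sum_rate(eff, noise_var=1.0, v_norm_sq=[1.0, 1.0]))  # prints 2.0
```

Any nonzero off-diagonal entry of `eff` adds to the interference term and lowers the rate, which is the coupling between precoders and combiners noted above.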

This paper focuses on distributed precoding design, where each BS $b$ optimizes its precoding vectors $\{\mathbf{w}_{b,k}\}_{k \in \mathcal{K}}$ locally while coordinating with the other BSs. For the sake of comparison, we also illustrate the centralized precoding design, where the aggregated precoding vectors $\{\mathbf{w}_k\}_{k \in \mathcal{K}}$ are optimized by the CPU and the BS-specific precoding vectors are fed back to the BSs. In both cases, the combining vectors $\{\mathbf{v}_k\}_{k \in \mathcal{K}}$ are computed locally by the corresponding UEs. In the following, we describe realistic pilot-aided CSI acquisition at both the BSs (in Section II-A) and the UEs (in Section II-B), which will be heavily referred to in Sections IV and V.

II-A Uplink Pilot-Aided Channel Estimation

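Let $\mathbf{h}_{b,k} \triangleq \mathbf{H}_{b,k} \mathbf{v}_k \in \mathbb{C}^{M \times 1}$ be the effective uplink channel vector between UE $k$ and BS $b$, and let $\mathbf{p}_k \in \mathbb{C}^{\tau \times 1}$ be the pilot sequence assigned to UE $k$, with $\|\mathbf{p}_k\|^2 = \tau$. Moreover, let $\rho_{\mathrm{UE}}$ denote the maximum transmit power at each UE. In the uplink pilot-aided channel estimation phase, each UE $k$ synchronously transmits its pilot sequence $\mathbf{p}_k$ using its combining vector $\mathbf{v}_k$ as precoder, i.e.,

$$\mathbf{X}_k^{\text{UL-1}} \triangleq \sqrt{\beta^{\text{UL-1}}} \, \mathbf{v}_k \mathbf{p}_k^{\mathrm{H}} \in \mathbb{C}^{N \times \tau} \tag{4}$$

where the power scaling factor $\beta^{\text{UL-1}}$ (equal for all the UEs) ensures that $\mathbf{X}_k^{\text{UL-1}}$ complies with the UE transmit power constraint (see Section V-B for more details on the choice of $\beta^{\text{UL-1}}$). Then, the receive signal at BS $b$ is given by

$$\mathbf{Y}_b^{\text{UL-1}} \triangleq \sum_{k \in \mathcal{K}} \mathbf{H}_{b,k} \mathbf{X}_k^{\text{UL-1}} + \mathbf{Z}_b^{\text{UL-1}} \tag{5}$$
$$= \sqrt{\beta^{\text{UL-1}}} \sum_{k \in \mathcal{K}} \mathbf{h}_{b,k} \mathbf{p}_k^{\mathrm{H}} + \mathbf{Z}_b^{\text{UL-1}} \in \mathbb{C}^{M \times \tau} \tag{6}$$

where $\mathbf{Z}_b^{\text{UL-1}}$ is the AWGN term at BS $b$ with elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{BS}}^2)$, and the least-squares (LS) estimate of $\mathbf{h}_{b,k}$ is obtained as

$$\hat{\mathbf{h}}_{b,k} \triangleq \frac{1}{\tau \sqrt{\beta^{\text{UL-1}}}} \mathbf{Y}_b^{\text{UL-1}} \mathbf{p}_k \tag{7}$$
$$= \mathbf{h}_{b,k} + \frac{1}{\tau} \sum_{\bar{k} \neq k} \mathbf{h}_{b,\bar{k}} \mathbf{p}_{\bar{k}}^{\mathrm{H}} \mathbf{p}_k + \frac{1}{\tau \sqrt{\beta^{\text{UL-1}}}} \mathbf{Z}_b^{\text{UL-1}} \mathbf{p}_k \tag{8}$$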

Here, perfect channel estimation is achieved when:

  • The pilot contamination in the second term of (8) is eliminated using, for instance, orthogonal pilots (i.e., $\mathbf{p}_{\bar{k}}^{\mathrm{H}} \mathbf{p}_k = 0$ for all $\bar{k} \neq k$) or non-orthogonal random pilots with infinite pilot length (i.e., $\tau \to \infty$);

  • The channel estimation noise in the third term of (8) is eliminated using infinite pilot length.

Note that these observations also apply to (13) and (19) in the following.
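The role of pilot orthogonality in the LS estimate can be illustrated with a small noiseless sketch in Python. It assumes a single-antenna BS, so that each effective uplink channel is a complex scalar; the helper names `received` and `ls_estimate` and all the numerical values are hypothetical.

```python
import math

def received(h, pilots, beta):
    """Noiseless pilot signal at a single-antenna BS: sqrt(beta) * sum_k h_k * p_k^H."""
    tau = len(pilots[0])
    return [math.sqrt(beta) * sum(h_k * p[t].conjugate() for h_k, p in zip(h, pilots))
            for t in range(tau)]

def ls_estimate(y, pilot, beta):
    """LS estimate: correlate the receive signal with the pilot, then rescale."""
    tau = len(pilot)
    return sum(y_t * p_t for y_t, p_t in zip(y, pilot)) / (tau * math.sqrt(beta))

h = [0.5 + 1.0j, -1.0 + 0.25j]               # effective channels of two UEs
orth = [[1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j]]  # orthogonal pilots, squared norm = tau = 2
non_orth = [[1 + 0j, 1 + 0j], [1 + 0j, 1j]]   # same norm, nonzero cross-correlation

err_orth = abs(ls_estimate(received(h, orth, 4.0), orth[0], 4.0) - h[0])
err_non = abs(ls_estimate(received(h, non_orth, 4.0), non_orth[0], 4.0) - h[0])
print(err_orth, err_non)  # first error is zero, second is the pilot contamination
```

With orthogonal pilots the estimate of the first UE's channel is exact in the absence of noise, whereas the non-orthogonal pair leaves a residual proportional to the other UE's channel.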

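On the other hand, the estimation of the channel matrix $\mathbf{H}_{b,k}$ requires $N$ antenna-specific pilot sequences for UE $k$. In this context, let $\mathbf{P}_k \in \mathbb{C}^{\tau \times N}$ be the pilot matrix assigned to UE $k$, with $\|\mathbf{P}_k\|_{\mathrm{F}}^2 = \tau N$. In the uplink pilot-aided channel estimation phase, each UE $k$ synchronously transmits its pilot matrix, i.e.,

$$\mathbf{X}_k^{\text{UL}} \triangleq \sqrt{\beta^{\text{UL}}} \, \mathbf{P}_k^{\mathrm{H}} \in \mathbb{C}^{N \times \tau} \tag{9}$$

where the power scaling factor $\beta^{\text{UL}}$ ensures that $\mathbf{X}_k^{\text{UL}}$ complies with the UE transmit power constraint. Then, the receive signal at BS $b$ is given by

$$\mathbf{Y}_b^{\text{UL}} \triangleq \sum_{k \in \mathcal{K}} \mathbf{H}_{b,k} \mathbf{X}_k^{\text{UL}} + \mathbf{Z}_b^{\text{UL}} \tag{10}$$
$$= \sqrt{\beta^{\text{UL}}} \sum_{k \in \mathcal{K}} \mathbf{H}_{b,k} \mathbf{P}_k^{\mathrm{H}} + \mathbf{Z}_b^{\text{UL}} \in \mathbb{C}^{M \times \tau} \tag{11}$$

where $\mathbf{Z}_b^{\text{UL}}$ is the AWGN term at BS $b$ with elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{BS}}^2)$, and the LS estimate of $\mathbf{H}_{b,k}$ is obtained as

$$\hat{\mathbf{H}}_{b,k} \triangleq \frac{1}{\tau \sqrt{\beta^{\text{UL}}}} \mathbf{Y}_b^{\text{UL}} \mathbf{P}_k \tag{12}$$
$$= \frac{1}{\tau} \mathbf{H}_{b,k} \mathbf{P}_k^{\mathrm{H}} \mathbf{P}_k + \frac{1}{\tau} \sum_{\bar{k} \neq k} \mathbf{H}_{b,\bar{k}} \mathbf{P}_{\bar{k}}^{\mathrm{H}} \mathbf{P}_k + \frac{1}{\tau \sqrt{\beta^{\text{UL}}}} \mathbf{Z}_b^{\text{UL}} \mathbf{P}_k \tag{13}$$
$$= \mathbf{H}_{b,k} + \frac{1}{\tau} \sum_{\bar{k} \neq k} \mathbf{H}_{b,\bar{k}} \mathbf{P}_{\bar{k}}^{\mathrm{H}} \mathbf{P}_k + \frac{1}{\tau \sqrt{\beta^{\text{UL}}}} \mathbf{Z}_b^{\text{UL}} \mathbf{P}_k \tag{14}$$

where (14) holds only if $\mathbf{P}_k^{\mathrm{H}} \mathbf{P}_k = \tau \mathbf{I}_N$ (i.e., if there is no pilot contamination among the columns of $\mathbf{P}_k$).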

Remark 1.

The expressions of the receive signals in (6) and (11) imply that the transmit signals from all the UEs are received synchronously by each BS. Although perfect synchronization is infeasible, quasi-synchronous operations can be achieved in practice by setting the duration of the cyclic prefix to accommodate both the synchronization errors and the delay spread, as described in [Int19, Bjo20]. These considerations also apply to the downlink pilot-aided channel estimation in Section II-B (see (17)) and to the new uplink signaling resource introduced in Section V (see (52)).

II-B Downlink Pilot-Aided Channel Estimation

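Let $\mathbf{g}_k \triangleq \sum_{b \in \mathcal{B}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,k} \in \mathbb{C}^{N \times 1}$ be the effective downlink channel vector between all the BSs and UE $k$. In the downlink pilot-aided channel estimation phase (see [Int19a]), each BS $b$ synchronously transmits a superposition of the pilot sequences $\{\mathbf{p}_k\}_{k \in \mathcal{K}}$ after precoding them with the corresponding precoding vectors $\{\mathbf{w}_{b,k}\}_{k \in \mathcal{K}}$, i.e.,

$$\mathbf{X}_b^{\text{DL}} \triangleq \sum_{k \in \mathcal{K}} \mathbf{w}_{b,k} \mathbf{p}_k^{\mathrm{H}} \in \mathbb{C}^{M \times \tau} \tag{15}$$

Then, the receive signal at UE $k$ is given by

$$\mathbf{Y}_k^{\text{DL}} \triangleq \sum_{b \in \mathcal{B}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{X}_b^{\text{DL}} + \mathbf{Z}_k^{\text{DL}} \tag{16}$$
$$= \sum_{b \in \mathcal{B}} \sum_{\bar{k} \in \mathcal{K}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,\bar{k}} \mathbf{p}_{\bar{k}}^{\mathrm{H}} + \mathbf{Z}_k^{\text{DL}} \in \mathbb{C}^{N \times \tau} \tag{17}$$

where $\mathbf{Z}_k^{\text{DL}}$ is the AWGN term at UE $k$ with elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{UE}}^2)$, and the LS estimate of $\mathbf{g}_k$ is obtained as

$$\hat{\mathbf{g}}_k \triangleq \frac{1}{\tau} \mathbf{Y}_k^{\text{DL}} \mathbf{p}_k \tag{18}$$
$$= \mathbf{g}_k + \frac{1}{\tau} \sum_{b \in \mathcal{B}} \sum_{\bar{k} \neq k} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,\bar{k}} \mathbf{p}_{\bar{k}}^{\mathrm{H}} \mathbf{p}_k + \frac{1}{\tau} \mathbf{Z}_k^{\text{DL}} \mathbf{p}_k \tag{19}$$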

III Problem Formulation with Perfect CSI

In this paper, we target the weighted sum MSE minimization problem to optimize the precoding vectors and the combining vectors. This can be used as a surrogate of the more involved weighted sum rate maximization problem (or, equivalently, of the iterative weighted sum MSE minimization problem [Shi11]). In fact, since the total number of BS antennas in the network is much larger than the number of UEs, the weighted sum MSE minimization yields only a minor penalty in terms of sum-rate performance as compared with the weighted sum rate maximization, while being much easier to handle and providing an inherent fairness across the UEs. In this section, we tackle the weighted sum MSE minimization problem under perfect channel estimation; the results derived here will be highly useful to describe the case of realistic pilot-aided CSI acquisition at both the BSs and the UEs in Sections IV and V.

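Building on (1), let us introduce the MSE at UE $k$ as

$$\mathrm{MSE}_k \triangleq \mathbb{E} \big[ |\mathbf{v}_k^{\mathrm{H}} \mathbf{y}_k - d_k|^2 \big] \tag{20}$$
$$= \sum_{\bar{k} \in \mathcal{K}} \Big| \sum_{b \in \mathcal{B}} \mathbf{v}_k^{\mathrm{H}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,\bar{k}} \Big|^2 - 2 \Re \Big[ \sum_{b \in \mathcal{B}} \mathbf{v}_k^{\mathrm{H}} \mathbf{H}_{b,k}^{\mathrm{H}} \mathbf{w}_{b,k} \Big] + \sigma_{\mathrm{UE}}^2 \|\mathbf{v}_k\|^2 + 1 \tag{21}$$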

which is convex with respect to either the transmit or the receive strategies, but not jointly convex with respect to both. This makes the joint optimization of the precoding and the combining vectors extremely challenging, especially under limited signaling between the BSs and the UEs. Hence, we use alternating optimization, whereby the precoding vectors are optimized for fixed combining vectors and vice versa in an iterative best-response fashion (as done, e.g., in [Shi11, Kal18]).

  • Optimization of the combining strategies. The combining vectors are computed locally and independently by the UEs such that each UE  minimizes in (21). In the centralized precoding design, the combining vectors are also derived by the CPU in conjunction with the precoding vectors as part of the alternating optimization routine (although they are not fed back to the UEs). From the point of view of UE , we can rewrite the MSE as

    (22)

    where we have defined

    (23)

    The combining vector that minimizes (22) is the well-known MMSE receiver, which may be written as

    (24)

    Observe that can be computed locally by UE  as in (24) if in (23) and the effective downlink channel are known by UE .

  • Optimization of the precoding strategies. The precoding vectors are computed as the solutions of the weighted sum MSE minimization problem with per-BS power constraints, where is the weight assigned to UE . To this end, we introduce the following preliminary definitions: , , , , and , where the latter may be rewritten as

    (25)

    with . Finally, the weighted sum MSE can be expressed as

    (26)

    In the following, we first describe the centralized precoding design in Section III-A and then focus on the distributed precoding design via backhaul signaling in Section III-B.
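The combining step of the alternating optimization can be sketched in Python for the special case of single-antenna UEs, where the MMSE receiver reduces to a complex scalar. This is an illustrative simplification rather than the general multi-antenna expression in (24); `gains[j]` is a hypothetical name for the effective downlink gain at the considered UE of the stream intended for UE j.

```python
def mse(v, gains, k, noise_var):
    """MSE at UE k for a scalar combiner v, unit-power symbols, and noise variance noise_var."""
    total = sum(abs(g) ** 2 for g in gains) + noise_var
    return abs(v) ** 2 * total - 2.0 * (v.conjugate() * gains[k]).real + 1.0

def mmse_combiner(gains, k, noise_var):
    """Scalar MMSE receiver: desired gain over total receive power (signal + interference + noise)."""
    total = sum(abs(g) ** 2 for g in gains) + noise_var
    return gains[k] / total

gains = [1.0 + 1.0j, 0.5 + 0.0j]   # illustrative effective gains seen by UE 0
v_opt = mmse_combiner(gains, 0, noise_var=0.5)
print(mse(v_opt, gains, 0, 0.5))
print(mse(1.1 * v_opt, gains, 0, 0.5))  # any perturbation increases the MSE
```

For fixed precoders the MSE is a convex quadratic in the combiner, so the closed-form receiver is its unique minimizer; the precoding step then treats this combiner as fixed.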

III-A Centralized Precoding Design

In the centralized precoding design, the aggregated precoding vectors are computed by the CPU and the BS-specific precoding vectors are fed back to the corresponding BSs via backhaul signaling. Here, the alternating optimization of the precoding and the combining vectors takes place transparently at the CPU. Hence, for fixed combining vectors, the CPU solves the weighted sum MSE minimization problem

(27)

where is a selection matrix such that . For each UE , the first-order optimality condition of (27) reads as

(28)

where are the (coupled) dual variables related with the per-BS power constraints, which can be optimized, e.g., using the ellipsoid method. Finally, (28) yields the centralized precoding solution

(29)
(30)

The centralized precoding design is carried out as follows. First, each BS  acquires the channel matrices and forwards them to the CPU via backhaul signaling. Then, the CPU computes the aggregated precoding vectors as in (29) together with the combining vectors as in (24) by means of alternating optimization. Subsequently, it feeds back the BS-specific precoding vectors to each BS  via backhaul signaling. Lastly, each UE  acquires in (23) and the effective downlink channel , based on which it computes its combining vector as in (24).

III-B Distributed Precoding Design via Backhaul Signaling

In the distributed precoding design, the BS-specific precoding vectors are computed locally by the BSs. Here, the alternating optimization of the precoding and the combining vectors takes place by means of iterative bi-directional training between the BSs and the UEs (see [Shi14, Kal18, Jay18, Tol19]). Hence, for fixed combining vectors, the BSs solve the weighted sum MSE minimization problem

(31)

For each BS  and for each UE , the first-order optimality condition of (31) reads as

(32)

where the dual variable has the same meaning as in (28) and can be optimized via bisection methods. (Footnote 1: The equivalence between the centralized and the distributed precoding solutions in (29) and (33), respectively, is shown in Appendix A for the simple case of $B = 2$ BSs, which can be extended to any value of $B$ by recursively applying the Schur complement.) Finally, (32) yields the distributed precoding solution

(33)

where we have defined

(34)
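The bisection search for the dual variable of a per-BS power constraint can be sketched as follows. The sketch assumes a single-antenna BS, so that the matrix to be inverted reduces to a positive scalar `phi`, and it takes the per-UE numerators `c` (local channel terms minus cross terms) as given; all names and values are hypothetical.

```python
def tx_power(lam, phi, c):
    """Transmit power of a BS whose precoders are c_k / (phi + lam); requires phi > 0."""
    return sum(abs(c_k) ** 2 for c_k in c) / (phi + lam) ** 2

def dual_by_bisection(phi, c, rho, tol=1e-10):
    """Smallest lam >= 0 such that tx_power(lam) <= rho."""
    if tx_power(0.0, phi, c) <= rho:
        return 0.0                      # constraint inactive: no regularization needed
    lo, hi = 0.0, 1.0
    while tx_power(hi, phi, c) > rho:   # grow the bracket until it contains the solution
        hi *= 2.0
    while hi - lo > tol:                # power is decreasing in lam, so bisection applies
        mid = 0.5 * (lo + hi)
        if tx_power(mid, phi, c) > rho:
            lo = mid
        else:
            hi = mid
    return hi

lam = dual_by_bisection(phi=1.0, c=[2.0 + 0j, 2j], rho=2.0)
print(lam, tx_power(lam, 1.0, [2.0 + 0j, 2j]))
```

Since the transmit power decreases monotonically with the dual variable, each BS can run this one-dimensional search independently of the other BSs.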

Recall that the computation of the precoding vectors by BS $b$ requires the optimization of its dual variable via bisection methods. (Footnote 2: Observe that implies .) Building on the parallel optimization framework proposed in [Scu14], the distributed precoding design can be implemented in an iterative best-response fashion [Kal18]. Focusing on UE $k$, at each iteration $i$, each BS $b$ locally computes its best-response precoding vector as in (33) in parallel with the other BSs, for fixed precoding vectors of the other BSs (and, thus, for a fixed cross-term vector in (34)); then, each BS $b$ updates its precoding vector as

(35)

with . In this context, the update in (35) is necessary to limit the variation of the precoding vectors between consecutive iterations, where the step size must be chosen to strike the proper balance between convergence speed and accuracy. We refer to [Kal18, Scu14] for more details on the choice of and on the convergence properties.
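A minimal Python sketch of such a step-size update is given below. It assumes that the update is a convex combination of the previous precoding vector and the newly computed best response, with a step size in (0, 1], as is typical of parallel best-response schemes; the function name `damped_update` and the numerical values are hypothetical.

```python
def damped_update(w_prev, w_best, alpha):
    """Move from the previous precoder toward the best response by a fraction alpha."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in (0, 1]")
    return [(1.0 - alpha) * wp + alpha * wb for wp, wb in zip(w_prev, w_best)]

w = [0j, 2.0 + 0j]            # previous precoding vector (two antennas)
w_best = [1.0 + 1.0j, 0j]     # best response computed at the current iteration
for _ in range(3):
    w = damped_update(w, w_best, alpha=0.5)
    gap = max(abs(a - b) for a, b in zip(w, w_best))
    print(gap)                # the gap halves at each step for a fixed best response
```

A step size of 1 jumps directly to the best response, while smaller values trade convergence speed for smoother variations between consecutive iterations.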

Remark 2.

The vector in (34) contains implicit information about the channel correlation between BS  and the other BSs and about the precoding vectors adopted by the latter for UE . The knowledge of such cross-term information at each BS  is required to iteratively adjust the distributed precoding solution so that it converges to its centralized counterpart described in Section III-A. In this regard, omitting from (33) yields the highly suboptimal local MMSE precoding. Note that, while the effective uplink channels (which are also used to compute ) can be acquired locally by each BS  via uplink training, the acquisition of calls for extensive CSI exchange among the BSs via backhaul signaling [Kal18]. In Section V, we propose a practical scheme to implement the distributed precoding design that relies solely on OTA signaling.

Remark 3.

The computational complexity associated with the distributed precoding design is , with being the number of bisection steps per iteration; here, the term follows from the -dimensional matrix inversion in (33). On the other hand, the computational complexity associated with the centralized precoding design described in Section III-A is , where the term follows from the -dimensional matrix inversion in (29). Hence, despite its iterative nature, the distributed precoding design brings a substantial computational complexity reduction as the total number of BS antennas in the network  is usually very large in cell-free massive MIMO contexts.
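To give a feel for the orders of magnitude involved, the following Python sketch compares the two designs under a deliberately crude cost model: a matrix inversion is assumed to cost the cube of its dimension, the distributed design is charged one local inversion per bisection step and per iteration at each BS, and the cost of optimizing the dual variables at the CPU is ignored. The function names and the network sizes are hypothetical.

```python
def inversion_cost(dim):
    """Assumed cubic cost of inverting a dim-by-dim matrix."""
    return dim ** 3

def cost_ratio(num_bs, antennas_per_bs, bisection_steps, iterations):
    """Ratio between one centralized inversion and the per-BS distributed workload."""
    central = inversion_cost(num_bs * antennas_per_bs)
    distributed = iterations * bisection_steps * inversion_cost(antennas_per_bs)
    return central / distributed

# 25 BSs with 4 antennas each, 10 bisection steps, 10 iterations (illustrative values).
print(cost_ratio(25, 4, 10, 10))  # prints 156.25
```

Under this model the gap grows cubically with the number of BSs, so the iterative overhead of the distributed design is quickly outweighed as the network grows.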

The distributed precoding design via backhaul signaling is carried out as follows. First, for fixed combining vectors , each BS  acquires the effective uplink channels and, by means of backhaul signaling, the vectors . Then, it computes its precoding vectors locally as in (33) and it updates them as in (35). Subsequently, each UE  acquires in (23) and the effective downlink channel , based on which it computes its combining vector locally as in (24). This process is iterated until a predefined termination criterion is satisfied.

IV Problem Formulation with Imperfect CSI

In this section, we consider the centralized and the distributed precoding designs described in Sections III-A and III-B under realistic pilot-aided CSI acquisition at both the BSs and the UEs (see Sections II-A and II-B). Here, the precoding vectors and the combining vectors are computed as the solutions of an estimated weighted sum MSE minimization problem with per-BS power constraints. For notational simplicity, and without loss of generality, we assume .

IV-A Centralized Precoding Design

In the centralized precoding design, the CPU computes the combining vectors and the aggregate precoding vectors for each UE  as

(36)
(37)

respectively, as part of the alternating optimization routine. Here, (36) and (37) are obtained from minimizing in (21) after replacing the channels with the estimated channels (obtained as in (13)), and are equal to (24) and (29), respectively, for perfect channel estimation. The implementation of the centralized precoding design is formalized in Algorithm 1. Note that such scheme is highly susceptible to imperfect channel estimation as it hinges on a single pilot-aided CSI acquisition (see Remark 5).

   Data: Pilot matrices and pilot sequences ( can be the first column of ).
  • UL: Each UE  transmits the pilot matrix (see in (9)); each BS  receives  in (11).

  • Each BS  obtains as in (13) and forwards them to the CPU via backhaul signaling.

  • The CPU computes the aggregated precoding vectors as in (37) together with the combining vectors as in (36) by means of alternating optimization.

  • The CPU feeds back the BS-specific precoding vectors to each BS  via backhaul signaling.

  • DL: Each BS  transmits a superposition of the pilot sequences after precoding them with the corresponding precoding vectors (see in (15)); each UE  receives in (17).

  • Each UE  computes its combining vector as in (41).

Algorithm 1 (Centralized)

IV-B Distributed Precoding Design via Backhaul Signaling

In the distributed precoding design, after the downlink pilot-aided channel estimation phase, each UE  obtains

(38)

with and defined in (17) and (23), respectively, and

(39)

Here, perfect channel estimation would imply that:

  • The pilot contamination in the second term of (38) is eliminated;

  • As , we have that .

Hence, UE  can use (38) as an estimate of and, consequently, it can obtain an estimate of in (22) as

(40)

Finally, each UE  can compute its combining vector locally as

(41)

which is equal to (24) for perfect channel estimation.

On the other hand, for the computation of the precoding vectors, let us define and . The following steps describe how the cross-term information can be expressed in terms of the receive signals at the BSs in the uplink pilot-aided channel estimation phase. For each BS pair and , we have the following relation:

(42)

with defined in (6) and

(43)

Note that is not available at BS . Here, perfect channel estimation would imply that:

  • The pilot contamination in the second term of (42) is eliminated;

  • As , we have that if and .

Hence, (42) can be intended as an estimate of if or of if and, consequently, can be intended as an estimate of . This can be exploited to write the estimated sum MSE as

(44)

where the term removes the noise bias from the estimation of . For fixed combining vectors, the BSs solve the estimated sum MSE minimization problem

(45)

Finally, for each BS  and for each UE , the first-order optimality condition of (45) yields the distributed precoding solution

(46)

which is equal to (33) for perfect channel estimation, and where the term in the inverse matrix removes the noise bias from the estimation of (the same holds for (49), (55), and (56)). To compute as in (46), BS  needs to acquire the term from each BS  via backhaul signaling, as described in [Kal18]. The iterative implementation of the distributed precoding design via backhaul signaling is formalized in Algorithm 2. Here, suitable termination criteria can be, for instance, , where is the maximum number of iterations (fixed to comply with some latency constraints or adapted to the duration of the scheduling block), , or . These observations also apply to Algorithms 3 and 4 in the following.

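   Data: Pilot sequences $\{\mathbf{p}_k\}_{k \in \mathcal{K}}$.
   Initialization: Each BS $b$ initializes its precoding vectors $\{\mathbf{w}_{b,k}\}_{k \in \mathcal{K}}$; set $i = 0$.
   Until a predefined termination criterion is satisfied, do:
  • $i \leftarrow i + 1$.

  • (S.1) DL: Each BS $b$ transmits a superposition of the pilot sequences $\{\mathbf{p}_k\}_{k \in \mathcal{K}}$ after precoding them with the corresponding precoding vectors $\{\mathbf{w}_{b,k}\}_{k \in \mathcal{K}}$ (see $\mathbf{X}_b^{\text{DL}}$ in (15)); each UE $k$ receives $\mathbf{Y}_k^{\text{DL}}$ in (17).

  • (S.2) Each UE $k$ computes its combining vector $\mathbf{v}_k$ as in (41).

  • (S.3) UL-1: Each UE $k$ transmits its pilot sequence $\mathbf{p}_k$ after precoding it with its combining vector $\mathbf{v}_k$ (see $\mathbf{X}_k^{\text{UL-1}}$ in (4)); each BS $b$ receives $\mathbf{Y}_b^{\text{UL-1}}$ in (6).

  • (S.4) For each UE $k$, each BS $b$ acquires the required cross-term information from the other BSs via backhaul signaling.

  • (S.5) For each UE $k$, each BS $b$ computes its precoding vectors as in (46) and updates them as in (35).

   End
Algorithm 2 (Distributed–backhaul)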
Remark 4.

The amount of backhaul signaling for CSI exchange in step S.4 of Algorithm 2 scales not only with the pilot length and the number of bi-directional training iterations, but also with the number of BSs and the number of UEs since the cross terms are specific for each BS-UE pair. This becomes burdensome in cell-free massive MIMO contexts due to the high number of BSs and UEs involved in the joint processing. In addition, the CSI exchange among the BSs via backhaul signaling does not occur instantaneously. (Footnote 3: Without loss of generality, one can express the delay introduced by the backhaul into the CSI exchange in terms of number of bi-directional training iterations. In our numerical results in Section VI, we assume that such delay amounts to one bi-directional training iteration.) Therefore, each BS must rely on outdated CSI from the other BSs, which can significantly degrade the performance of the distributed precoding design (as demonstrated in [Kal18]). In Section V, we propose a practical scheme that allows each BS to acquire the missing cross-term information via OTA signaling, which entirely eliminates the need for backhaul signaling for CSI exchange among the BSs.

Remark 5.

As detailed in [Kal18], using (42) as a surrogate of provides improved robustness against pilot contamination with respect to estimating each UE channel explicitly. Consequently, the distributed precoding design is less sensitive to pilot contamination than the centralized precoding design described in Section IV-A. Even in absence of pilot contamination, due to its iterative nature that involves several pilot-aided CSI acquisitions, the distributed precoding design is more robust to noisy channel estimation than its centralized counterpart (which hinges on a single pilot-aided CSI acquisition). In this regard, it is straightforward to observe that the precoding vector update at iteration , i.e., defined in (35), can be expressed as a weighted average of precoding vectors computed as in (46) based on as many channel estimations with independent AWGN realizations. Hence, the update in (35) produces a beneficial averaging of the channel estimation noise that reflects positively on the sum-rate performance. The robustness of the distributed precoding design against both pilot contamination and channel estimation noise is highlighted in our numerical results in Section VI.
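The averaging effect described in this remark can be made concrete with a short Python sketch. It assumes that the update is a convex combination with a constant step size, so that unrolling the recursion over the iterations gives geometrically decaying weights, and it models each per-iteration solution as the ideal one plus independent noise of equal variance, which is a simplification of the true dependence on the channel estimates. The function names are hypothetical.

```python
def averaging_weights(alpha, num_iters):
    """Weights of the initial point and of the num_iters best responses after unrolling."""
    weights = [(1.0 - alpha) ** num_iters]
    weights += [alpha * (1.0 - alpha) ** (num_iters - j) for j in range(1, num_iters + 1)]
    return weights

def noise_gain(alpha, num_iters):
    """Variance scaling of the averaged noise relative to a single estimate."""
    return sum(w ** 2 for w in averaging_weights(alpha, num_iters)[1:])

print(sum(averaging_weights(0.3, 8)))             # the weights sum to one
print(noise_gain(0.3, 8), noise_gain(1.0, 8))     # smaller step size, stronger averaging
```

With a unit step size only the latest estimate is retained and no averaging takes place, whereas smaller step sizes spread the weight over several independent noise realizations.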

For comparative purposes, in the next section, we present a centralized precoding design with iterative bi-directional training between the BSs (which communicate with the CPU via backhaul signaling) and the UEs. Similarly to the distributed precoding design, this scheme involves pilot-aided CSI acquisitions at each bi-directional training iteration and thus overcomes the main drawback of the centralized precoding design described in Section IV-A.

IV-C Centralized Precoding Design with Iterative Bi-Directional Training

In the centralized precoding design with iterative bi-directional training, the aggregated precoding vectors are computed by the CPU and the BS-specific precoding vectors are fed back to the corresponding BSs via backhaul signaling. Unlike the centralized precoding design in Algorithm 1, which hinges on a single pilot-aided CSI acquisition, the alternating optimization of the precoding and the combining vectors takes place by means of iterative bi-directional training between the CPU and the UEs through the BSs (as in the distributed precoding design in Algorithm 2). Hence, for fixed combining vectors, the CPU solves the estimated sum MSE minimization problem

(47)

For each UE $k$, the first-order optimality condition of the resulting problem yields the centralized precoding solution

(48)
(49)

which is equal to (29) for perfect channel estimation. (Footnote 4: The equivalence between the centralized and the distributed precoding solutions in (48) and (46), respectively, can be shown in the same way as in the case with perfect CSI; see Section III.) The implementation of the centralized precoding design with iterative bi-directional training is formalized in Algorithm 3. This scheme is used for comparative purposes in our numerical results in Section VI; however, the high computational complexity resulting from the centralized precoding design, combined with the cumbersome backhaul signaling between the BSs and the CPU, makes its implementation highly impractical.

   Data: Pilot sequences .
   Initialization: The CPU initializes the aggregated precoding vectors ; set .
   Until a predefined termination criterion is satisfied, do:
  • .

  • The CPU feeds back the BS-specific precoding vectors to each BS  via backhaul signaling.

  • DL: Each BS  transmits a superposition of the pilot sequences after precoding them with the corresponding precoding vectors (see in (15)); each UE  receives in (17).

  • Each UE  computes its combining vector as in (41).

  • UL-1: Each UE  transmits its pilot sequence after precoding it with its combining vector (see in (4)); each BS  receives in (6).

  • Each BS  forwards to the CPU via backhaul signaling.

  • The CPU computes the precoding vectors as in (48).

   End
Algorithm 3 (Centralized–iterative)

V Distributed Precoding Design via OTA Signaling

In this section, we propose a novel OTA signaling scheme that entirely eliminates the need for backhaul signaling for CSI exchange among the BSs and, hence, overcomes the practical limitations of the distributed precoding design described in Section IV-B. To this end, we introduce a new uplink signaling resource together with a new CSI combining mechanism that complement the existing uplink and downlink signaling described in Sections II-A and