# Periodic Analog Channel Estimation Aided Beamforming for Massive MIMO Systems

Analog beamforming is an attractive and cost-effective solution to exploit the benefits of massive multiple-input-multiple-output systems, since it requires only one up/down-conversion chain. However, the presence of only one chain imposes a significant overhead in estimating the channel state information required for beamforming when conventional digital channel estimation (CE) approaches are used. As an alternative, this paper proposes a novel CE technique, called periodic analog CE (PACE), that can be performed by analog hardware. By avoiding digital processing, the estimation overhead is significantly lowered and does not scale with the number of antennas. PACE involves periodic transmission of a sinusoidal reference signal by the transmitter, estimation of its amplitude and phase at each receive antenna via analog hardware, and use of these estimates for beamforming. To enable such non-trivial operation, two reference tone recovery techniques and a novel receiver architecture for PACE are proposed and analyzed, both theoretically and via simulations. Results suggest that in sparse, wide-band channels and above a certain signal-to-noise ratio, PACE aided beamforming suffers only a small loss in beamforming gain and enjoys a much lower CE overhead in comparison to conventional approaches. Benefits of using PACE aided beamforming during the initial access phase are also discussed.

## I Introduction

Massive multiple-input-multiple-output (MIMO) systems, enabled by using antenna arrays with many elements at the transmitter (TX) and/or receiver (RX), promise large beamforming gains and improved spectral efficiency, and are therefore a key focus area for 5G systems research and development [1, 2]. Such massive antenna arrays, while also beneficial at sub-6 GHz frequencies, are essential at the higher millimeter-wave (mm-wave) frequencies to compensate for the large channel attenuation. However, despite their numerous benefits, full complexity massive MIMO architectures suffer from increased hardware cost and energy consumption. This is because, though the antenna elements are affordable, the corresponding up/down-conversion chains - which include circuit components such as analog-to-digital converters and digital-to-analog converters - are both expensive and power hungry [3]. A popular solution to reduce this implementation cost is hybrid beamforming [4, 5], where the large antenna array is connected to a small number of up/down-conversion chains via power-efficient and cost-effective analog hardware, such as phase-shifters. By using such analog hardware to focus power into the dominant channel directions, hybrid beamforming exploits the directional nature of wireless channels to minimize loss in system performance. In this paper, we focus on a special case of hybrid beamforming with one up/down-conversion chain (for the in-phase and quadrature-phase components each), referred to as analog beamforming.

A major challenge with analog beamforming (and with hybrid beamforming in general) is the acquisition of the channel state information (CSI) required for beamforming at the TX and RX. In narrow-band (i.e., frequency-flat fading) systems [6, 7, 8, 9, 10, 11, 12], the required CSI usually involves instantaneous channel parameters (iCSI), while in wide-band systems [13, 14, 15, 16, 17, 18] average channel parameters (aCSI) are used for designing the analog beamformer. Here aCSI refers to channel parameters that remain constant over a wide time-frequency range, such as the spatial correlation matrices, while iCSI are parameters that change faster. In either scenario, the required CSI can be obtained by transmitting known signals (pilots) and performing channel estimation (CE) at the RX within each CSI coherence time, i.e., the period over which the CSI remains constant. (The required CSI at the TX is obtained either via CE on the reverse link, or via CSI feedback from the RX.) Since all RX antennas share one down-conversion chain, multiple temporal pilot transmissions are required for performing such CE [19, 20, 21, 16]. As an illustration, exhaustive CE approaches [20] require $O(M_{\rm tx} M_{\rm rx})$ pilots, where $M_{\rm tx}, M_{\rm rx}$ are the number of TX and RX antennas, respectively, and $O(\cdot)$ represents the scaling behavior in big-O notation. Such a large pilot overhead may consume a significant portion of the time-frequency resources when the CSI coherence time is short, such as in vehicle-to-vehicle channels, in systems using narrow TX/RX beams, e.g., massive MIMO, or in channels with large carrier frequencies and high blocking probabilities, e.g., at mm-wave frequencies [22]. The overhead also increases system latency and makes the initial access (IA) procedure very cumbersome [23, 24, 25]. (Initial access refers to the phase wherein a user equipment and base-station discover each other, synchronize, and coordinate to initiate communication.) Several fast CE approaches have therefore been suggested to reduce the pilot overhead, which are discussed below assuming $M_{\rm tx} = 1$ for convenience. (For $M_{\rm tx} > 1$, the pilot overhead increases further, either multiplicatively or additively, by a function of $M_{\rm tx}$, determined by the CE algorithm used at the TX.) Side information aided narrow-band CE approaches utilize channel statistics and temporal correlation to reduce the iCSI pilot overhead [26, 27, 12, 21]. Compressed sensing based approaches [28, 19, 29, 30] exploit the sparse nature of the massive MIMO channels to reduce the pilots up to $O(L \log(M_{\rm rx}/L))$ per CSI coherence time, where $L$ is the channel sparsity level. Iterative angular domain CE performs beam sweeping at the RX with progressively narrower search beams to find a good beam direction with $O(\log M_{\rm rx})$ pilots [31, 32, 25]. Approaches that utilize side information to improve iterative angular domain CE [33, 34] or perform angle domain tracking [35, 36] have also been considered. Sparse ruler based approaches exploit the possible Toeplitz structure of the spatial correlation matrix to reduce pilots to $O(\sqrt{M_{\rm rx}})$ per CSI coherence time [37, 38, 39, 40, 16]. Since the overhead still scales with $M_{\rm rx}$, these approaches are only partially successful in reducing the pilot overhead. Furthermore, some of these CE approaches may not be applicable for IA since they would require the timing and frequency synchronization [41, 42] to be performed without the TX/RX beamforming gain, which may be difficult at the low signal-to-noise ratio (SNR) and high phase noise (i.e., random fluctuations of the instantaneous oscillator frequency) levels expected in mm-wave systems.
Some of these CE approaches also require the channel to remain static during the re-transmissions and are only applicable for certain antenna configurations and/or channel models. Finally, to reduce the impact of the transient effects of analog hardware on CE [43], the multiple pilots may have to be spaced sufficiently far apart[44], thus potentially increasing the latency.
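To make the scaling comparison above concrete, the following minimal Python sketch tabulates order-of-magnitude pilot counts for a single-antenna TX. It assumes the commonly quoted scalings (linear for exhaustive search, $L\log(M/L)$ for compressed sensing, $\log M$ for iterative angular search, $\sqrt{M}$ for sparse rulers) with all constants set to one. The function name `pilot_overhead` and the base-2 logarithm are illustrative choices, not taken from the paper.

```python
import math


def pilot_overhead(m_rx: int, sparsity: int) -> dict:
    """Order-of-magnitude pilot counts per CSI coherence time.

    Assumptions (illustrative only): one TX antenna, all hidden
    constants equal to 1, and base-2 logarithms.
    """
    return {
        # One pilot per RX antenna (or per RX beam).
        "exhaustive": m_rx,
        # Sparse recovery: roughly L * log(M / L) measurements.
        "compressed_sensing": math.ceil(sparsity * math.log2(m_rx / sparsity)),
        # Progressively narrower beams: roughly log(M) stages.
        "iterative_angular": math.ceil(math.log2(m_rx)),
        # Sparse ruler sampling of a Toeplitz correlation matrix.
        "sparse_ruler": math.ceil(math.sqrt(m_rx)),
    }


if __name__ == "__main__":
    for m in (16, 64, 256):
        print(m, pilot_overhead(m, sparsity=4))
```

Under these assumptions every entry still grows with the number of RX antennas, which is the limitation the analog estimation approach is meant to remove.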

The main reason for the pilot overhead is that conventional CE approaches require processing in the digital domain, thus having to time-share the down-conversion chain across the antennas. Inspired by ultra-wideband transmit reference schemes [45, 46, 47] and legacy adaptive antenna array techniques [48, 49, 50, 51], our recent conference papers [52, 53], explore a different novel approach that enables CE without digital processing. In this approach, the TX transmits a reference sinusoidal tone simultaneously with the data. The received reference signals (including both amplitude and phase) are then recovered at each RX antenna via analog hardware and are utilized as a homodyne combining filter for the data. In essence, [52, 53] show that a maximal ratio combining (MRC) beamformer built for a reference frequency also provides a good, albeit sub-optimal, beamforming gain at other frequencies in a sparse scattering, wide-band channel. This is because, although they experience frequency selective fading, such channels exhibit a strong coupling across frequency. Since recovering a reference sinusoidal signal, or equivalently estimating its amplitude and phase, is significantly simpler than conventional CE, it can be performed at each RX antenna by analog hardware such as phase locked loops. Thus, by avoiding digital CE, this scheme allows RX beamforming without pilot re-transmissions. We shall henceforth refer to this type of amplitude and phase estimation as analog channel estimation (ACE). Note that due to the limited capabilities of analog hardware and the low SNR before beamforming, performing ACE and exploring new ACE techniques is non-trivial. In the original design in [52], the reference has to be transmitted continuously, to enable its recovery at the RX. While this design reduces the estimation overhead and avoids phase-shifters, it requires carrier recovery circuits which may add to the cost and power consumption of the RX. 
Furthermore, continuous recovery of the reference tone is excessive, and may waste transmit power and spectral efficiency. In [53], a non-coherent variant of [52] is explored that avoids recovery circuits, but at the expense of a reduction in bandwidth efficiency. The current paper therefore proposes a different ACE scheme, referred to as periodic ACE (PACE), where the reference is transmitted judiciously, and its amplitude and phase are explicitly estimated to drive an RX phase shifter array. Unlike [52], PACE requires one carrier recovery circuit and phase shifters (see Fig. 1) and can support both homodyne and heterodyne reception.
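The claim that an MRC beamformer built at the reference frequency remains useful at other frequencies in a sparse channel can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a uniform linear array with half-wavelength spacing, a single-antenna TX, and hypothetical path gains, delays and angles. The helper names (`ula_response`, `channel_at`, `mrc_gain`) are not from the paper.

```python
import numpy as np


def ula_response(num_ant, angle_rad, spacing_wl=0.5):
    """Array response of a uniform linear array (assumed geometry)."""
    n = np.arange(num_ant)
    return np.exp(2j * np.pi * spacing_wl * n * np.sin(angle_rad))


def channel_at(freq_offset_hz, gains, delays_s, angles_rad, num_ant):
    """Frequency response h(f) = sum_l alpha_l a(psi_l) exp(-j 2 pi f tau_l)."""
    h = np.zeros(num_ant, dtype=complex)
    for g, tau, psi in zip(gains, delays_s, angles_rad):
        phase = np.exp(-2j * np.pi * freq_offset_hz * tau)
        h += g * ula_response(num_ant, psi) * phase
    return h


def mrc_gain(h_ref, h_f):
    """Beamforming gain |w^H h(f)|^2 with w matched to the reference tone."""
    w = h_ref / np.linalg.norm(h_ref)
    return float(np.abs(np.vdot(w, h_f)) ** 2)


if __name__ == "__main__":
    # Hypothetical two-path sparse channel.
    gains = [1.0, 0.5 * np.exp(0.7j)]
    delays = [0.0, 50e-9]
    angles = [0.3, -0.9]
    h_ref = channel_at(0.0, gains, delays, angles, num_ant=32)
    for f in (0.0, 5e6, 20e6):
        h_f = channel_at(f, gains, delays, angles, num_ant=32)
        # Compare against the per-frequency optimum ||h(f)||^2.
        print(f, mrc_gain(h_ref, h_f), np.linalg.norm(h_f) ** 2)
```

With a single path the reference-matched combiner attains the full array gain at every frequency; with several paths the gain is bounded by the per-frequency optimum $\|\mathbf{h}(f)\|^2$, and the gap reflects the frequency selectivity discussed in the text.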

In PACE, the TX transmits a reference tone at a known frequency during each periodic RX beamformer update phase. One carrier recovery circuit, involving phase-locked loops (PLLs), is used to recover the reference tone from one or more antennas, as shown in Fig. 1. This recovered reference tone, and its quadrature component, are then used to estimate the phase offset and amplitude of the received reference tone at each RX antenna, via a bank of ‘filter, sample and hold’ circuits (represented as integrators in Fig. 1). As shall be shown, these estimates are proportional to the channel response at the reference frequency. These estimates are used to control an array of variable gain phase-shifters, which generate the RX analog beam. During the data transmission phase, the wide-band received data signals pass through these phase-shifters, and are summed and processed similar to conventional analog beamforming. As the phase and amplitude estimation is done in the analog domain, a number of pilots that does not scale with the number of RX antennas is sufficient to update the RX beamformer. Additionally, the power from multiple channel multi-path components (MPCs) is accumulated by this approach, increasing the system diversity against MPC blocking. Furthermore, the same variable gain phase-shifts can also be used for transmit beamforming on the reverse link. Finally, by providing an option for digitally controlling the inputs to the phase-shifters, the proposed architecture can also support conventional beamforming approaches.

On the flip side, PACE requires some additional analog hardware components, such as mixers and filters, in comparison to conventional digital CE. Additionally, the accumulation of power from multiple MPCs may cause frequency selective fading in a wide-band scenario, which can degrade performance. Finally, the proposed approach in its current suggested form does not support reception of multiple spatial data streams and can only be used for beamforming at one end of a communication link. This architecture is therefore more suitable for use at the user equipment (UEs). The possible extensions to multiple spatial stream reception shall be explored in future work. While the proposed architecture is also applicable in narrow-band scenarios, in this paper we shall focus on the analysis of a wide-band scenario where the repetition interval of PACE and beamformer update is of the order of aCSI coherence time, i.e. time over which the aCSI stays approximately constant (also called stationarity time in some literature).

The contributions of this paper are as follows:

1. We propose a novel transmission technique, namely PACE, and a corresponding RX architecture that enable RX analog beamforming with low CE overhead.

2. To enable the RX operation, we also explore two novel reference recovery circuits. These circuits are non-linear, making their analysis non-trivial. We provide an approximate analysis of their phase-noise and the resulting performance that is tight in the high SNR regime.

3. We analytically characterize the achievable system throughput with PACE aided beamforming in a wide-band channel.

4. Simulations with practically relevant channel models are used to support the analytical results and compare performance to existing schemes.

The organization of the paper is as follows: the system model is presented in Section II; two designs for PACE and their respective noise analyses are presented in Section III; the system performance with PACE aided beamforming is characterized in Section IV; the advantages of PACE for transmit beamforming and during the IA phase are discussed in Section V; simulation results are presented in Section VI; and finally, conclusions are in Section VII.

Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, $j = \sqrt{-1}$, $a^*$ is the complex conjugate of a complex scalar $a$, $\|\mathbf{a}\|$ represents the $\ell_2$-norm of a vector $\mathbf{a}$ and $\mathbf{A}^{\dagger}$ is the conjugate transpose of a complex matrix $\mathbf{A}$. Finally, $\mathbb{E}\{\cdot\}$ represents the expectation operator, $\otimes$ represents the Kronecker product, $\overset{d}{=}$ represents equality in distribution, $\mathrm{Re}\{\cdot\}$/$\mathrm{Im}\{\cdot\}$ refer to the real/imaginary component, respectively, $\mathcal{CN}(\mathbf{a}, \mathbf{B})$ represents a circularly symmetric complex Gaussian vector with mean $\mathbf{a}$ and covariance matrix $\mathbf{B}$, $\mathrm{Exp}\{a\}$ represents an exponential distribution with mean $a$ and $\mathrm{Uni}\{a, b\}$ represents a uniform distribution in range $[a, b]$.

## II General Assumptions and System Model

We consider the downlink of a single-cell MIMO system, wherein one base station (BS) with $M_{\rm tx}$ antennas transmits to several UEs with $M_{\rm rx}$ antennas each. Since the focus is on the downlink, we shall use the abbreviations BS & TX and UE & RX interchangeably. Each UE is assumed to have one up/down-conversion chain, while no assumptions are made regarding the BS architecture.

Here we assume the communication between the BS and UEs to involve three important phases: (i) initial access (IA) - where the BS and UEs find each other, timing/frequency synchronization is attained and spectral resources are allocated; (ii) analog beamformer design - where the BS and UEs obtain the required aCSI to update the analog precoding/combining beams; and (iii) data transmission. The relative time scales of these phases are illustrated in Fig. 2. Through most of this paper (Sections II-IV), we assume that the IA and beamformer design at the BS are already achieved, and we mainly focus on the beamformer design phase at the UE and the data transmission phase. Therefore we assume perfect timing and frequency synchronization between the BS and UE, and assume that the TX beamforming has been pre-designed based on aCSI at the BS. Later in Section V, we also briefly discuss how aCSI can be acquired at the BS, how IA can be performed and how the use of PACE can be advantageous in those phases.

The BS transmits one spatial data-stream to each scheduled UE, and all such scheduled UEs are served simultaneously via spatial multiplexing. Furthermore, the data to the UEs is assumed to be transmitted via orthogonal precoding beams, such that there is no inter-user interference. (This type of precoding is possible by avoiding transmission to the scatterers common to multiple scheduled UEs [27].) Under these assumptions and given transmit precoding beams and power allocation, we shall restrict the analysis to one representative UE without loss of generality. For convenience, we shall also assume the use of noise-less and perfectly linear antennas, filters, amplifiers and mixers at both the BS and UE. An analysis including the non-linear effects of these components is beyond the scope of this paper. The BS transmits orthogonal frequency division multiplexing (OFDM) symbols with multiple sub-carriers, indexed by $k \in \mathcal{K}$, to this representative UE. (While the proposed PACE technique is also applicable to single carrier transmission, a detailed analysis of the same is beyond the scope of this paper.) The BS transmits two kinds of symbols: reference symbols and data symbols. In a reference symbol, only a reference tone, i.e., a sinusoidal signal with a pre-determined frequency known both to the BS and UE, is transmitted on the $0$-th sub-carrier, and the remaining sub-carriers are all empty. On the other hand, in a data symbol all the sub-carriers are used for data transmission. (In an actual implementation the data symbols may also have null and pilot sub-carriers, but we ignore them here for simplicity.) The purpose of the reference symbols is to aid PACE and beamformer design at the RX, as shall be explained later. Since the BS can afford an accurate oscillator, we shall assume that the BS suffers negligible phase noise. The complex equivalent transmit signal for an OFDM symbol, if it is a reference or data symbol, respectively, can then be expressed as:

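$$\tilde{\mathbf{s}}^{(r)}_{\rm tx}(t) = \sqrt{\frac{2}{T_{\rm cs}}}\, \mathbf{t}\, \sqrt{E^{(r)}}\, e^{j 2\pi f_c t} \tag{1a}$$

$$\tilde{\mathbf{s}}^{(d)}_{\rm tx}(t) = \sqrt{\frac{2}{T_{\rm cs}}}\, \mathbf{t} \Big[\sum_{k \in \mathcal{K}} x^{(d)}_k e^{j 2\pi f_k t}\Big] e^{j 2\pi f_c t}, \tag{1b}$$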

for , where is the unit-norm TX beamforming vector for this UE with , is the data signal at the -th OFDM sub-carrier, , is the carrier/reference frequency, represents the frequency offset of the -th sub-carrier, and are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by . For the data symbols, we assume the use of Gaussian signaling with , for each . The total average transmit OFDM symbol energy (including cyclic prefix) allocated to the UE is defined as , where and . For convenience we also assume that is a multiple of , which ensures that the reference tone has the same initial phase in consecutive reference symbols.

The channel to the representative UE is assumed to be sparse with resolvable MPCs (), and the corresponding channel impulse response matrix is given as [22]:

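$$\mathbf{H}(t) = \sum_{\ell=0}^{L-1} \alpha_\ell\, \mathbf{a}_{\rm rx}(\ell)\, \mathbf{a}_{\rm tx}(\ell)^{\dagger}\, \delta(t - \tau_\ell), \tag{2}$$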

where is the complex amplitude and is the delay and are the TX and RX array response vectors, respectively, of the -th MPC. As an illustration, the -th RX array response vector for a uniform planar array with horizontal and vertical elements () is given by , where we define:

 ¯arx(ψrxazi,ψrxele)≜ (3)

, are the azimuth and elevation angles of arrival for the -th MPC, are the horizontal and vertical antenna spacings and is the wavelength of the carrier signal. Expressions for can be obtained similarly. Note that in (2) we implicitly assume frequency-flat MPC amplitudes and ignore beam squinting effects [54], which are reasonable assumptions for moderate system bandwidths. To prevent inter symbol interference, we also let the cyclic prefix be longer than the maximum channel delay: . To model a time varying channel, we treat as aCSI parameters, that remain constant within an aCSI coherence time and may change arbitrarily afterwards.777While each MPC may contain several unresolved sub-paths, the corresponding set of scatterers are usually co-located. Therefore the relative sub-path delays and resulting MPC amplitude are expected to vary slowly with the TX/RX movement. However since the channel is more sensitive to delay variations, the MPC delays are modeled as iCSI parameters that only remain constant within a shorter interval called the iCSI coherence time. Note that this time variation of delays is an equivalent representation of the Doppler spread experienced by the RX. Finally, we do not assume any distribution prior or side information on .

The RX front-end is assumed to have a low noise amplifier followed by a band-pass filter at each antenna element that leaves the desired signal un-distorted but suppresses the out-of-band noise. The filtered complex equivalent received waveform for the -th symbol can then be expressed as:

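$$\tilde{\mathbf{s}}^{(\cdot)}_{\rm rx}(t) = \sum_{\ell=0}^{L-1} \alpha_\ell\, \mathbf{a}_{\rm rx}(\ell)\, \mathbf{a}_{\rm tx}(\ell)^{\dagger}\, \tilde{\mathbf{s}}^{(\cdot)}_{\rm tx}(t - \tau_\ell) + \sqrt{2}\, \tilde{\mathbf{w}}^{(\cdot)}(t)\, e^{j 2\pi f_c t} \tag{4}$$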

for , where , is the complex equivalent, base-band, stationary, additive, vector Gaussian noise process, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density: for . During the data transmission phase, the received data waveform is phase shifted by a bank of phase-shifters, whose outputs are summed and fed to a down-conversion chain for data demodulation, as in conventional analog beamforming. However unlike conventional CE based analog beamforming, the control signals to the phase-shifters are obtained using the reference symbols and using PACE, as shall be discussed in the next section.

## III Analog beamformer design at the receiver

During each beamformer design phase, the BS transmits several consecutive reference symbols to facilitate PACE at the RX. This process involves two steps: locking a local RX oscillator to the received reference tone and using this locked oscillator to estimate the amplitude and phase-offsets at each antenna. (Note that IA based time/frequency synchronization usually involves digital post-processing. Thus prior IA based synchronization does not guarantee that an RX oscillator is locked to the reference tone.) Here locking refers to ensuring that the phase difference between the oscillator and the received reference tone is approximately constant. The initial reference symbols are used for the former step and the remaining $D_2$ symbols are used for the latter step. Therefore the number of reference symbols is independent of the number of RX antennas and is mainly determined by the time required for oscillator locking (see Remark III.1). The first step shall be referred to as recovery of the reference tone and is analyzed in Section III-A, while the latter step is discussed in Section III-B. As shall be shown, both steps are significantly impaired by channel noise. Therefore in Section III-C, we propose an improved architecture for reference tone recovery that provides better noise performance, albeit with a slightly higher hardware complexity. For convenience, we shall assume that the MPC delays do not change within the beamformer design phase (see also Remark III.2). However the delays may be different during the data transmission phase, as shall be considered in Section IV. The part of the received signal that suffers inter-symbol interference is not included here. Without loss of generality, the complex equivalent RX signal for the reference symbols at antenna $m$ can be expressed as:

$$\tilde{s}^{(r)}_{{\rm rx},m}(t) = \sqrt{2}\, A^{(r)}_m\, e^{j 2\pi f_c t} + \sqrt{2}\, \tilde{w}^{(r)}_m(t)\, e^{j 2\pi f_c t} \tag{5}$$

over the reference symbols, where $A^{(r)}_m$ is the amplitude of the reference tone at antenna $m$.

### III-A Recovery of the reference tone - using one PLL

For locking a local RX oscillator to the reference signal, we first consider the use of a type 2 analog PLL at RX antenna $1$, as illustrated in Fig. 3. The PLL is a common carrier-recovery circuit - with a mixer, a loop low pass filter (LF), a variable loop gain ($G$) and a voltage controlled oscillator (VCO) arranged in a feedback mechanism - that can filter the noise from an input noisy sinusoidal signal (see [55, 56] for more details).

Here LF is assumed to be a first-order active low-pass filter with a transfer function $\mathrm{LF}(s) = 1 + \epsilon/s$, and the loop gain $G$ is assumed to adapt to the amplitude of the input. (Such a variable gain can possibly be implemented by using an automatic gain control circuit.) For convenience, we also ignore the VCO’s internal noise [57, 58]. Without loss of generality, let the output of the VCO (i.e., the recovered reference tone) be expressed as:

$$s_{\rm PLL}(t) = s_{\rm vco}(t) = \sqrt{2} \cos\big[2\pi f_c t + \bar{\theta} + \theta(t)\big] \tag{6}$$

where may be arbitrary and we define such that . Then the stochastic differential equation governing (6) for is given by [56]:

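$$\begin{aligned} 2\pi f_c + \frac{d\theta(t)}{dt} &= \mathrm{LF}\Big\{\mathrm{Re}\big\{\tilde{s}^{(r)}_{{\rm rx},1}(t)\big\} \sqrt{2} \cos\big[2\pi f_c t + \bar{\theta} + \theta(t)\big]\Big\} G + 2\pi f_{\rm vco} \\ &= \mathrm{LF}\Big\{\mathrm{Re}\big[A^{(r)}_1 e^{-j[\bar{\theta} + \theta(t)]} + \tilde{w}^{(r)}_1(t)\, e^{-j[\bar{\theta} + \theta(t)]}\big]\Big\} G + 2\pi f_{\rm vco} \end{aligned} \tag{7}$$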

where is the free running frequency of the VCO with no input, we use (5) and assume is much larger than the bandwidth of . In this subsection, we are interested in finding the time required for locking (), i.e., for to (nearly) converge to a constant and characterizing the distribution of the PLL output , or equivalently , during the last reference symbols when the PLL is locked to the reference tone. The first part is answered by the following remark:

###### Remark III.1.

For the PLL considered, the phase lock acquisition time is in the no noise scenario [55, 56]. Thus and must be of the orders of and respectively, to keep small.
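The dependence of the lock acquisition time on the loop parameters can be explored with a small noise-free simulation. The sketch below is an illustration only, not the circuit analyzed in the paper. It uses a standard phase-domain model of a type 2 loop, assumes a proportional-plus-integral loop filter $1 + \epsilon/s$, folds the product of loop gain and input amplitude into a single parameter `loop_gain`, and integrates with forward Euler. All numerical values are hypothetical.

```python
import math


def simulate_pll(loop_gain, eps, freq_offset_hz, phase0, dt, steps):
    """Noise-free phase-domain model of a type 2 PLL (assumed form).

    err   : phase error between the input tone and the VCO (rad)
    integ : state of the integrating branch of the loop filter
    Returns the phase error after every Euler step.
    """
    err, integ = phase0, 0.0
    dw = 2.0 * math.pi * freq_offset_hz  # input-minus-VCO frequency (rad/s)
    history = []
    for _ in range(steps):
        s = math.sin(err)  # phase detector output
        # Loop filter 1 + eps/s applied to the detector output.
        derr = dw - loop_gain * (s + eps * integ)
        integ += s * dt
        err += derr * dt
        history.append(err)
    return history


if __name__ == "__main__":
    trace = simulate_pll(loop_gain=1e5, eps=1e4, freq_offset_hz=1e3,
                         phase0=1.0, dt=1e-7, steps=20000)
    # Report the first time the error stays below 10 mrad.
    settled = next(i for i in range(len(trace))
                   if all(abs(e) < 1e-2 for e in trace[i:i + 200]))
    print("settling time (s):", settled * 1e-7)
    print("final phase error (rad):", trace[-1])
```

Increasing `loop_gain` or `eps` shortens the transient in this model, which is consistent with the remark that both quantities must be large enough to keep the locking time small.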

Numerous techniques [59, 60] have been proposed to further reduce the lock acquisition time, which are not explored here for brevity. In the locked state, it can be shown that $\theta(t)$ suffers from random fluctuations due to the input noise in (7), and that it is approximately a zero mean random process [55, 56]. This fluctuation manifests as phase noise of $s_{\rm PLL}(t)$. While several attempts have been made to characterize the locked state (see [56, 55] and references therein), closed form results are available only for a few simple scenarios that are not applicable here. Therefore, for analytical tractability, we linearize (7) using the following widely used approximations [56]:

1. We neglect cycle slips and assume that the deviations of about its mean value are small, such that in the locked state.

2. We assume that the distribution of the base-band noise process is invariant to multiplication with , i.e., is also a Gaussian noise process with power spectral density .

Approximation 1 is accurate in the locked state and in the large SNR regime, while Approximation 2 is accurate when the noise bandwidth is much larger than the loop filter bandwidth [61, 56]. Using these approximations and the definition of $\bar{\theta}$, we can linearize (7) as:

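$$\frac{d\theta_L(t)}{dt} = \mathrm{LF}\bigg\{-\big|A^{(r)}_1\big|\, \theta_L(t) + \frac{\hat{w}^{(r)}_1(t) + [\hat{w}^{(r)}_1(t)]^*}{2}\bigg\} G - 2\pi [f_c - f_{\rm vco}] \tag{8}$$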

where we replace by to denote use of the linear approximation. Note that for sufficient SNR, () during the last reference symbols. Assuming and the PLL input to be for and taking the Laplace transform on both sides of (8), we obtain:

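$$s\, \Theta_L(s) = G\, \mathrm{LF}(s) \bigg[-\big|A^{(r)}_1\big|\, \Theta_L(s) + \frac{\hat{W}^{(r)}_1(s) + [\hat{W}^{(r)}_1(s^*)]^*}{2}\bigg] - \frac{2\pi [f_c - f_{\rm vco}]}{s} \tag{9}$$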

where $\Theta_L(s)$ and $\hat{W}^{(r)}_1(s)$ are the Laplace transforms of $\theta_L(t)$ and $\hat{w}^{(r)}_1(t)$, respectively. It can be verified using the final value theorem that the contribution of the last term on the right hand side of (9) vanishes for large $t$ (i.e., in locked state). Therefore ignoring this term in (9), we observe that $\theta_L(t)$ is a zero mean, stationary Gaussian process [57] in the locked state. Furthermore, the locked state power spectral density, auto-correlation function and variance of $\theta_L(t)$ can then be computed, respectively, as:

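$$S_{\theta_L}(f) = \mathbb{E}\big|\Theta_L(j 2\pi f)\big|^2 = \frac{|G|^2 (4\pi^2 f^2 + \epsilon^2)\, S_w(f)}{2\, \Big|-4\pi^2 f^2 + G (j 2\pi f + \epsilon) \big|A^{(r)}_1\big|\Big|^2} \tag{10}$$

$$R_{\theta_L}(\tau) = \int_{-\infty}^{\infty} S_{\theta_L}(f)\, e^{j 2\pi f \tau}\, df \approx \frac{|G|^2 N_0}{4} \bigg[\frac{a^2 - \epsilon^2}{a (a^2 - b^2)}\, e^{-a|\tau|} + \frac{b^2 - \epsilon^2}{b (b^2 - a^2)}\, e^{-b|\tau|}\bigg] \tag{11}$$

$$\mathrm{Var}\{\theta_L(t)\} = R_{\theta_L}(0) \le \frac{N_0 \big(\big|A^{(r)}_1\big| G + \epsilon\big)}{4 \big|A^{(r)}_1\big|^2}, \tag{12}$$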

where $-a$ and $-b$ are the roots of $s^2 + G\big|A^{(r)}_1\big| s + G\big|A^{(r)}_1\big| \epsilon = 0$, (11)–(12) follow from finding the inverse Fourier transform via partial fraction expansion, and the final expressions follow by observing that $S_w(f) \le N_0$ for all $f$. Since $\theta_L(t)$ is stationary and Gaussian in locked state, note that its distribution is completely characterized by (10)–(12).
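As a sanity check on (10) and (12), the closed-form variance can be compared against a direct numerical integration of the power spectral density. The sketch below assumes a flat noise spectrum $S_w(f) = N_0$ (so the bound in (12) is met with equality up to truncation error) and uses hypothetical loop parameters. The function names are illustrative.

```python
import numpy as np


def variance_closed_form(amp, gain, eps, n0):
    """Right hand side of (12): N0 (|A| G + eps) / (4 |A|^2)."""
    return n0 * (amp * gain + eps) / (4.0 * amp ** 2)


def variance_by_integration(amp, gain, eps, n0, f_max=2e6, n=400001):
    """Riemann-sum integration of the PSD in (10), assuming S_w(f) = N0."""
    f = np.linspace(-f_max, f_max, n)
    w = 2.0 * np.pi * f
    num = gain ** 2 * (w ** 2 + eps ** 2) * n0
    den = 2.0 * np.abs(-w ** 2 + gain * (1j * w + eps) * amp) ** 2
    psd = num / den
    return float(np.sum(psd) * (f[1] - f[0]))


if __name__ == "__main__":
    # Hypothetical values: unit amplitude, G = 1e5, eps = 1e4, N0 = 1e-6.
    params = dict(amp=1.0, gain=1e5, eps=1e4, n0=1e-6)
    print("closed form :", variance_closed_form(**params))
    print("integration :", variance_by_integration(**params))
```

The two values agree closely for these parameters; the small residual gap comes from truncating the integral at `f_max`, since the spectrum decays only as $1/f^2$.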

### III-B Phase and amplitude offset estimation

This subsection analyzes the procedure for reference signal phase and amplitude offset estimation at each RX antenna. As illustrated in Fig. 1, the PLL signal from antenna is fed to a phase shifter to obtain its quadrature component. From (6), the in-phase and quadrature-phase components of the PLL signal for can be expressed together as:

$$\tilde{s}_{\rm PLL}(t) = \sqrt{2}\, e^{j[2\pi f_c t + \bar{\theta} + \theta(t)]}. \tag{13}$$

At each RX antenna, the received reference signal is multiplied by the in-phase and quadrature-phase components of the PLL signal, and the resulting outputs are fed to ‘filter, sample and hold’ circuits. This circuit involves a low pass filter with a bandwidth of , followed by a sample and hold circuit that samples the filtered output at the end of the reference symbols. For convenience, in this paper we shall approximate this ‘filter, sample and hold’ by an integrate and hold operation as depicted in Fig. 1. Representing the ‘filter, sample and hold’ outputs corresponding to the in-phase and quadrature-phase components of the PLL output as real and imaginary respectively, the complex sample and hold vector can be approximated as:

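$$\mathbf{I}_{\rm PACE} \approx \frac{1}{D_2} \int_{T_1}^{T_2} \mathrm{Re}\big\{\tilde{\mathbf{s}}^{(r)}_{\rm rx}(t)\big\}\, \tilde{s}^{*}_{\rm PLL}(t)\, dt = \frac{1}{D_2} \int_{T_1}^{T_2} \bigg[\sqrt{\frac{1}{T_{\rm cs}}}\, \hat{\mathbf{H}}(0)\, \mathbf{t}\, \sqrt{E^{(r)}}\, e^{-j[\bar{\theta} + \theta(t)]} + \hat{\mathbf{w}}^{(r)}(t)\bigg] dt, \tag{14}$$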

where is a scaling factor, , , is the frequency-domain channel matrix for the -th subcarrier during beamformer design phase and is an i.i.d. Gaussian noise process vector with power spectral density (see Approximation 2). Note that in locked state (), we have (), as per approximations 1 and 2. Furthermore from (11), the auto-correlation function of decays exponentially with a time constant of . Therefore, for , experiences enough independent realizations of . Therefore replacing the integral in (14) with an expectation over phase noise, we have:

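$$\begin{aligned} \mathbf{I}_{\rm PACE} &\overset{(1)}{\approx} \sqrt{T_{\rm cs}}\, \hat{\mathbf{H}}(0)\, \mathbf{t}\, \sqrt{E^{(r)}}\, e^{-j\bar{\theta}}\, \mathbb{E}\big\{e^{-j\theta_L(t)}\big\} + \int_{T_1}^{T_2} \frac{\hat{\mathbf{w}}^{(r)}(t)}{D_2}\, dt \\ &\overset{(2)}{=} \sqrt{T_{\rm cs}}\, \hat{\mathbf{H}}(0)\, \mathbf{t}\, \sqrt{E^{(r)}}\, e^{-j\bar{\theta}}\, e^{-\frac{\mathrm{Var}\{\theta_L(t)\}}{2}} + \sqrt{T_{\rm cs}}\, \hat{\mathbf{W}}^{(r)}, \end{aligned} \tag{15}$$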

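where $(1)$ follows from the fact that $\theta(t) \approx \theta_L(t)$ in locked state, and $(2)$ follows by defining $\hat{\mathbf{W}}^{(r)} \triangleq \frac{1}{\sqrt{T_{\rm cs}}} \int_{T_1}^{T_2} \frac{\hat{\mathbf{w}}^{(r)}(t)}{D_2}\, dt$ and by using the characteristic function for the stationary Gaussian process $\theta_L(t)$. Since $\hat{\mathbf{w}}^{(r)}(t)$ is an i.i.d. Gaussian noise process, $\hat{\mathbf{W}}^{(r)}$ is a zero mean Gaussian vector with i.i.d. entries. From (15), note that the signal component of the sample and hold output is directly proportional to the channel matrix at the reference frequency. The outputs $\mathbf{I}_{\rm PACE}$ are used as control signals to the RX phase-shifter array, to generate the RX analog beam to be used during the data transmission phase. From (15) and (12), note that either the reference symbol energy $E^{(r)}$ or the number of integrated reference symbols $D_2$ can be increased to reduce the impact of noise on the analog beam. Since $\big|A^{(r)}_1\big|$ is a non-decreasing function of $E^{(r)}$ (see (5)), this implies that $E^{(r)}$ should be kept as large as possible while satisfying the transmit energy constraint and meeting the spectral mask regulations.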
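The attenuation factor $e^{-\mathrm{Var}\{\theta_L(t)\}/2}$ in (15) is the characteristic function of a zero mean Gaussian phase evaluated at unity. The short Monte Carlo sketch below checks this identity. It assumes independent phase samples, which idealizes the argument that the integration window spans many independent realizations of the phase noise; the function name and sample count are illustrative.

```python
import numpy as np


def mean_phasor(var_theta, num_samples=200_000, seed=0):
    """Empirical mean of exp(-j*theta) for theta ~ N(0, var_theta)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, np.sqrt(var_theta), num_samples)
    return complex(np.mean(np.exp(-1j * theta)))


if __name__ == "__main__":
    for var in (0.01, 0.1, 0.5, 2.0):
        # Empirical magnitude next to the predicted exp(-var / 2).
        print(var, abs(mean_phasor(var)), np.exp(-var / 2.0))
```

The empirical mean tracks $e^{-\mathrm{Var}/2}$, so a larger phase-noise variance shrinks the useful signal component of the integrator output exponentially while leaving the noise term unchanged.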

Note that the results in this section are based on several approximations, including the linear phase noise analysis in Section III-A. To test the accuracy of these results, numerical values of the mean integrator output, obtained by simulating realizations of $\theta(t)$ from (7), are compared to the analytic approximation in Fig. 4. Note that this comparison reflects the accuracy of the approximation in (15). As is evident from Fig. 4, (15) is accurate above a certain SNR. Additionally, since the signal component decays exponentially with $\mathrm{Var}\{\theta_L(t)\}$ (see (15)), we observe from Fig. 4(a) that the mean integrator output drops drastically below a certain threshold SNR. As shall be shown in Section IV, such a drop in the mean causes a sharp degradation in the system performance below this threshold SNR. Therefore in the next subsection we propose a better reference recovery circuit, called weighted carrier arraying, that reduces the SNR threshold.

###### Remark III.2.

The preceding derivations assumed that the MPC delays are identical for the reference symbols. However, since the PLL continuously tracks the RX signal and the phase/amplitude estimation at each antenna is performed simultaneously, these results remain valid even if the delays change slowly within the beamformer design phase.

###### Remark III.3.

Neither the RX phase-shifter array nor the down-conversion chain is utilized during the reference symbols of the beamformer design phase. Therefore, data reception is also possible in parallel during these reference symbols, as long as a sufficient guard band is provided between the data sub-carriers and the reference sub-carrier (similar to (27)) to reduce the impact on the PLL performance.

Note that in a multi-cell scenario, use of the same reference tone in adjacent cells can cause reference tone contamination, i.e., may contain components corresponding to the channel from a neighboring BS. This is analogous to pilot contamination in conventional CE approaches [1], and can be avoided by using different, well-separated reference frequencies in adjacent cells.

### III-C Recovery of the reference tone - using weighted carrier arraying

To reduce the PLL SNR threshold and improve performance, in this subsection we propose a new reference recovery technique called weighted carrier arraying, as illustrated in Fig. 5. Apart from a primary PLL, weighted carrier arraying has secondary PLLs at a subset of antennas, which compensate for the inter-antenna phase shift. The resulting phase-compensated signals from the antennas are weighted, combined and tracked by the primary PLL, which operates at a higher SNR and with a wider loop bandwidth than the secondary PLLs. Note that this architecture can be interpreted as a generalization of the carrier recovery process in [62, 63, 50, 51] that allows weighted combining. We shall next analyze the performance of this arrayed PLL in the locked state. However, an analysis of the transient behavior and lock acquisition time of this design is beyond the scope of this paper.
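To illustrate why weighting matters when phase-compensated antenna signals are combined, the sketch below computes the SNR of a weighted sum. It is a simplified illustration and not the paper's analysis: it assumes perfect phase compensation, independent noise of equal power at every antenna, and hypothetical per-antenna amplitudes. The function name `combined_snr` and the two weight choices (equal weights, and weights proportional to the amplitudes) are assumptions introduced for this example only.

```python
def combined_snr(amplitudes, weights, noise_power=1.0):
    """SNR of a weighted sum of phase-aligned antenna signals.

    Assumptions (illustrative only): perfect phase compensation, and
    independent noise with the same power at each antenna.
    """
    if len(amplitudes) != len(weights):
        raise ValueError("amplitudes and weights must have the same length")
    # Phase-aligned signal components add coherently in amplitude.
    signal_amplitude = sum(w * a for w, a in zip(weights, amplitudes))
    # Independent noise components add in power.
    noise = noise_power * sum(w * w for w in weights)
    return signal_amplitude ** 2 / noise


if __name__ == "__main__":
    amplitudes = [1.0, 0.5, 0.25, 0.1]   # hypothetical per-antenna amplitudes
    equal_weights = [1.0] * len(amplitudes)
    matched_weights = list(amplitudes)   # weights proportional to amplitude

    print("best single antenna:", combined_snr([amplitudes[0]], [1.0]))
    print("equal weights:      ", combined_snr(amplitudes, equal_weights))
    print("matched weights:    ", combined_snr(amplitudes, matched_weights))
```

With these example amplitudes, equal-weight combining can end up below the strongest single antenna, whereas amplitude-proportional weights never do (by the Cauchy-Schwarz inequality), which is consistent with the motivation for allowing weighted combining ahead of the primary PLL.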

In Fig. 5, refer to low-pass and band-pass filters with wide bandwidths, designed only to remove the unwanted side-band of the mixer outputs. Without loss of generality, we express the outputs of the primary and secondary VCOs as follows (footnote 11: another convergence point for is at a frequency of , but the final results presented here are also valid for this alternate convergence point):

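$$
\begin{aligned}
s_{\rm pvco}(t) &= \sqrt{2}\cos\big[2\pi(f_c - f_{\rm IF})t + \theta(t)\big] \\
s_{{\rm svco},m}(t) &= \sqrt{2}\cos\big[2\pi f_{\rm IF}t + \bar{\phi}_m + \phi_m(t)\big], \quad m \in \mathcal{M}
\end{aligned}
$$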

respectively, where are arbitrary, is the common free-running frequency of the secondary VCOs, and are such that for all . Now, similar to Section III-A, from (5) the differential equation governing the secondary PLL at antenna $m$ can be expressed as:

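$$
\begin{aligned}
\frac{d\phi_m(t)}{dt} &= \mathrm{Re}\Big[A^{(r)}_m e^{-j[\bar{\phi}_m + \phi_m(t) + \theta(t)]} + \tilde{w}^{(r)}_m(t)\,e^{-j[\bar{\phi}_m + \phi_m(t) + \theta(t)]}\Big]\frac{G_{s,m}}{\sqrt{2}} \\
&= \mathrm{Re}\Big[-j|A^{(r)}_m|\,e^{-j[\phi_m(t) + \theta(t)]} + \hat{w}^{(r)}_m(t)\Big]\frac{G_{s,m}}{\sqrt{2}}
\end{aligned}
\tag{16}
$$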

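where we define $\hat{w}^{(r)}_m(t) \triangleq \tilde{w}^{(r)}_m(t)\,e^{-j[\bar{\phi}_m + \phi_m(t) + \theta(t)]}$ and $G_{s,m}$ is the loop gain of the secondary VCO at antenna $m$. Similarly, for the primary VCO we have: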

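$$
2\pi(f_c - f_{\rm IF}) + \frac{d\theta(t)}{dt} = \mathrm{LF}\Big\{\sum_{m \in \mathcal{M}} \mathrm{Re}\Big[-j|A^{(r)}_m|\,e^{-j[\phi_m(t) + \theta(t)]} \cdots \tag{17}
$$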

where is the free-running frequency of the primary VCO, is the loop gain and $\mathrm{LF}\{\cdot\}$ is an active low-pass filter with transfer function $\mathrm{LF}(s)$. Similar to Section III-A, to obtain the locked-state distribution of , we rely on the linear PLL analysis, using two approximations: 1) , which is accurate in the high-SNR locked state where , and 2) , which is accurate for a wide noise bandwidth. Using these approximations in (16)–(17) with zero initial conditions and taking Laplace transforms, we obtain:

 sΦLm(s)=(−|A(r)m|[ΦLm(s)+ΘL(s)] +^W(r)m(s)+[^W(r)m(s∗)]∗2)Gsm√2 (18a) sΘL(s)=LF(s)∑m∈M[