# Message passing-based joint CFO and channel estimation in millimeter wave systems with one-bit ADCs

Channel estimation at millimeter wave (mmWave) is challenging when large antenna arrays are used. Prior work has leveraged the sparse nature of mmWave channels via compressed sensing based algorithms for channel estimation. Most of these algorithms, though, assume perfect synchronization and are vulnerable to phase errors that arise due to carrier frequency offset (CFO) and phase noise. Recently sparsity-aware, non-coherent beamforming algorithms that are robust to phase errors were proposed for narrowband phased array systems with full resolution analog-to-digital converters (ADCs). Such energy based algorithms, however, are not robust to heavy quantization at the receiver. In this paper, we develop a joint CFO and wideband channel estimation algorithm that is scalable across different mmWave architectures. Our method exploits the sparsity of mmWave MIMO channel in the angle-delay domain, in addition to compressibility of the phase error vector. We formulate the joint estimation as a sparse bilinear optimization problem and then use message passing for recovery. We also give an efficient implementation of a generalized bilinear message passing algorithm for the joint estimation in mmWave systems with one-bit ADCs. Simulation results show that our method is able to recover the CFO and the channel compressively, even in the presence of phase noise.

## Authors

• 6 publications
• 33 publications
• ### Gridless Angular Domain Channel Estimation for mmWave Massive MIMO System With One-Bit Quantization Via Approximate Message Passing

We develop a direction of arrival (DoA) and channel estimation algorithm...
09/23/2019 ∙ by Liangyuan Xu, et al. ∙ 0

• ### mmWave Channel Estimation via Approximate Message Passing with Side Information

This work considers millimeter-wave channel estimation in a setting wher...
05/05/2020 ∙ by Dror Baron, et al. ∙ 0

• ### Message passing-based link configuration in short range millimeter wave systems

Millimeter wave (mmWave) communication in typical wearable and data cent...
07/11/2019 ∙ by Nitin Jonathan Myers, et al. ∙ 0

• ### Joint Channel-Estimation/Decoding with Frequency-Selective Channels and Few-Bit ADCs

We propose a fast and near-optimal approach to joint channel-estimation,...
07/06/2018 ∙ by Peng Sun, et al. ∙ 0

• ### Relay-Aided Channel Estimation for mmWave Systems with Imperfect Antenna Arrays

Compressed Sensing (CS) based channel estimation techniques have recentl...
06/01/2019 ∙ by Mohammed E. Eltayeb, et al. ∙ 0

• ### Super-Resolution mmWave Channel Estimation using Atomic Norm Minimization

We propose super-resolution MIMO channel estimators for millimeter-wave ...
01/23/2018 ∙ by Hongyun Chu, et al. ∙ 0

• ### Generalized Approximate Message Passing for Massive MIMO mmWave Channel Estimation with Laplacian Prior

This paper tackles the problem of millimeter-Wave (mmWave) channel estim...
03/05/2019 ∙ by Faouzi Bellili, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## I Introduction

Millimeter wave communication introduces new challenges in the design of MIMO communication systems [1]. For instance, large antenna arrays at the transmitter (TX) and the receiver (RX) are necessary to meet the link budget requirements [2]. As a result, the channel has a higher dimension compared to what is typical in lower frequency MIMO systems, and must be estimated more frequently thanks to the smaller coherence time [3]. Furthermore, cost and power consumption are major issues at the larger bandwidths that accompany mmWave, primarily due to high resolution ADCs [4]. Typical mmWave hardwares that limit power consumption at large bandwidths introduce compression in the channel measurements. For example, the one-bit ADC architecture [4] allows access to the output of every antenna at the expense of heavy quantization. The compression of channel measurements and the use of large antenna arrays complicate signal processing at mmWave.

Compressed sensing (CS) [5][6] is an efficient technique to recover sparse high-dimensional signals with few projections. As MIMO channel matrices at mmWave are sufficiently sparse when expressed in an appropriate dictionary, applying tools from CS to mmWave channel estimation can potentially reduce the training overhead. CS-based sparse channel estimation algorithms have been proposed for various hardware architectures [7, 8, 9]. Recent developments in approximate message passing [10] [11] have enabled channel estimation algorithms in low resolution receivers [12]. Most CS-based channel estimation algorithms, however, assume perfect synchronization and fail in practice because of the CFO and phase noise [13].

CFO and phase noise are hardware impairments that corrupt the phase of the channel measurements. The mismatch between the carrier frequencies of the local oscillators at the TX and the RX results in CFO. Phase noise in the system arises due to short-term random fluctuations in the frequency of the oscillators. Both these non-idealities are larger at mmWave due to the high carrier frequency and ignoring them can result in significant channel estimation error [14]. Correcting for the CFO and then performing channel estimation seems like a possible solution. The disadvantage, however, is that prior to beamforming or channel estimation, mmWave systems operate at very low SNR, which can result in significant error in the CFO estimate. Prior work has considered joint CFO and channel estimation [15][16] in lower frequency systems. These joint estimation algorithms, however, cannot be applied to typical mmWave systems due to differences in the hardware architectures. Furthermore, they are not designed to incorporate the sparse nature of mmWave channels. Therefore, there is a need to design either phase error robust channel estimation algorithms, or joint CFO and channel estimation algorithms that can exploit the sparsity of mmWave channel.

Recent work on phase error robust channel estimation is limited to narrowband systems and has focussed on specific mmWave hardware. In [13],[17] and [18], phase error robust compressive beamforming algorithms were proposed for the analog beamforming architecture. In [13], phase tracking followed by phase error compensated compressive beamforming was proposed. The compensation, however, was done prior beamforming and therefore suffers from low SNR. The non-coherent algorithms in [17] and [18] are not robust to heavy quantization at the receiver. These algorithms cannot be used in one-bit receivers because the energy information of the channel measurements is completely lost due to one-bit quantization.

Recent joint CFO and sparse narrowband channel estimation algorithms in [14] and [19] require high computational complexity when extended to wideband systems. In [14]

, we proposed a joint CFO and narrowband channel estimation algorithm using third-order tensors

[20]. We also developed a sparsity-aware joint estimation algorithm for the one-bit ADC architecture in [19]. The main idea underlying our approach in [19] was to use the lifting technique [21] along with message passing for the joint estimation with one-bit channel measurements. Extending the narrowband solutions in [14] or [19] to typical wideband mmWave systems would require convex optimization over millions of variables, which may be prohibitive in a practical setting.

In this paper, we propose a sparse bilinear formulation of the joint CFO and wideband channel estimation problem, and solve it using message passing. We assume that all the RF chains at the TX or the RX are driven by the same reference oscillator. Hence, there is a unique CFO and a phase noise process in our MIMO system model, that corrupt the channel measurements. We also assume that there is perfect frame timing synchronization between the TX and the RX. We summarize the main contributions of our work as follows.

• We formulate the joint CFO and wideband channel estimation problem as a noisy quantized sparse bilinear optimization problem. Our framework leverages the sparse nature of the wideband channel in the angle and delay domains, and also exploits the compressibility of the phase error vector in the frequency domain.

• To solve the non-convex problem at hand, we use the vector variance version of the Parametric Bilinear Generalized Approximate Message Passing (PBiGAMP) algorithm

[22]

and optimize it for fast joint estimation. The parameters of the sparse priors corresponding to the wideband channel and the phase error vector are learned using an Expectation Maximization (EM) algorithm.

• We provide insights into the design of training matrices for joint CFO and channel estimation using PBiGAMP. Specifically, we show that shifted Zadoff-Chu training proposed in [12] to accelerate message passing cannot be used for joint estimation as it results in a continuum of optimal solutions for the bilinear optimization problem. We explain the “CFO propagation effect” to highlight the trade-off between fast message passing and identifiability in the sparse bilinear problem.

• We evaluate the performance of our joint estimation algorithm assuming a digital receiver architecture with one-bit ADCs and compare it with the hypothetical full resolution case. Simulation results show that the proposed approach is able to recover both the channel and the CFO compressively with IID Gaussian and IID QPSK training matrices, even in the presence of phase noise uncertainity.

Our algorithm is advantageous over the existing sparsity-aware methods for joint estimation or phase error robust channel estimation in terms of the capability to efficiently handle frequency selective channels and scalability to other mmWave architectures.

Notation is a matrix, is a column vector and denote scalars. Using this notation and represent the transpose, conjugate and conjugate transpose of . The matrices and contain the element-wise magnitude and squared magnitude of the entries of . We use and to denote the row and column of . We use to denote a diagonal matrix with entries of on its diagonal. The scalar denotes the element of . The symbol is used to denote the kronecker product. is a vector obtained by stacking all the columns of and denotes the element of . We define . We use to denote the set . The matrix

denotes the unitary discrete Fourier transform matrix.

is the probability density function of complex Gaussian random vector with mean

and covariance . We define as the dimensional canonical basis vector with its coordinate as 1.

## Ii System and Channel Models

In this section, we describe the underlying hardware architecture, CFO and phase noise model, and the wideband mmWave channel model used for our simulations. In particular, we focus on the digital receiver architecture with one-bit ADCs, to highlight the differences with the existing non-coherent algorithms. Nevertheless, our algorithm can be extended to other mmWave architectures as the underlying joint estimation problem is bilinear in nature.

### Ii-a System Model

We consider a MIMO system with uniform linear array of antennas at the TX and antennas at the RX, as shown in Fig. 1. We use linear arrays for a concise representation of the simplifications involved in PBiGAMP; our framework can be extended to other array geometries using appropriate array response vectors in the formulation. We do not impose constraints on the number of RF chains or the resolution of the digital-to-analog converters (DACs) at the TX. The resolution of the ADCs at the RX, however, is assumed to be limited. The baseband signal at the TX is upconverted to the mmWave band, using a local oscillator at a carrier frequency . The transmitted RF signal propagates through the wireless channel and is downconverted at the RX using a carrier frequency , that slightly differs from . Although the MIMO system can have multiple RF chains, we assume that all the RF chains at a given end are driven by the same reference oscillator. Even if the RF chains at a given end were driven by different oscillators, achieving carrier synchronization locally is feasible [23]. After downconversion at the RX, the output at each antenna is sampled using a pair of bit ADCs, one each for the in-phase and the quadrature phase components. We use to represent the bit quantization function corresponding to the ADCs. In this work, we consider the extreme case of and provide a performance comparison relative to . The quantization functions for the two cases are and . Note that the functions and are applied element-wise on the vector.

The impact of CFO on channel estimation algorithms is more significant in one-bit receivers than the full resolution ones. The mismatch in the carrier frequencies, i.e., is typically in the order of several parts per millions (ppms) of or . Due to the high carrier frequencies at mmWave, even such small differences can significantly perturb the channel estimate when ignored [13]. For a symbol duration of seconds, we define the digital domain CFO as . CFO results in unknown phase errors in the received samples that linearly increase with time. Hence, the impact of CFO on standard channel estimation algorithms is determined by the length of training. As one-bit receivers relatively need a longer training for channel estimation when compared to the full resolution ones, channel estimation algorithms that ignore phase errors are more vulnerable to the CFO in one-bit systems than the full resolution ones.

Phase noise in wireless systems arises due to jitter in the frequency of the oscillators. For a phase noise variance of at the TX and at the RX, the phase noise variance in the received samples can be approximated as . The approximation is valid when the bandwidth of the phase noise power spectral density is significantly smaller than the channel coherence bandwidth [24]. Let denote the phase error introduced in the received sample, due to phase noise at the TX and the RX. As is common in prior work, we model the phase errors using a Wiener process [25] in which the increments, i.e.,

, are IID Gaussian random variables with zero mean and variance

. As is proportional to [13], phase noise is higher at mmWave carrier frequencies for a given quality of oscillator.

Now, we describe the received signal model in the digital receiver architecture. Let be the transmit symbol satisfying the power constraint . The discrete time baseband representation of the MIMO channel is assumed to be limited to taps. Let be the tap of the equivalent baseband channel, where . We assume that perfect frame timing synchronization can be achieved using the control channel. Our assumption can be justified in situations where the mmWave system co-exists with a lower frequency system [26] that can perform timing synchronization. Frequency synchronization, however, may not be achieved due to different offsets and phase noise processes for each of these systems. With the timing synchronization assumption, the sampled baseband vector in the symbol duration can be given by

 Y(n)=Qq(ej(ϵn+ϕn)L−1∑ℓ=0H[ℓ]t[n−ℓ]+V(n)), (1)

where is additive white Gaussian noise. In this work, we develop an algorithm to estimate and from the series of observations . Our joint estimation algorithm can be extended to any -bit ADC architecture by defining appropriate output likelihood functions in message passing.

### Ii-B Channel Model

We consider a clustered channel model for the frequency selective mmWave MIMO channel. The channel consists of clusters with rays in the cluster. Let , , and denote the complex gain, delay, angle-of-arrival (AoA) and angle-of-departure (AoD) of the ray in the cluster. We assume that the transmitted signal is bandlimited to . With , and the Vandermonde vector

 aN(Δ)=[1,ejΔ,ej2Δ,⋯,ej(N−1)Δ]T, (2)

the tap of the wideband MIMO channel for a half wavelength spaced uniform linear array is given by

 H[ℓ]=Ncs∑n=1Mn∑m=1γn,maNrx(ωr,n,m)a∗Ntx(ωt,n,m)sinc(ℓ−τn,mT). (3)

The wideband channel can be represented using complex entries, and the matrix in (3) is large in typical mmWave systems. The channel impulse response in (3) is represented using a linear combination of bandlimited functions. Notice that each of these functions is delayed by the normalized delay spread, i.e, and evaluated at periodic time instants to obtain the discrete time representation in (3). Other filtering functions could also be used to incorporate the effect of pulse shaping at the TX or filtering at the RX [27].

The mmWave MIMO channel is aproximately sparse in an appropriate dictionary due to the propagation characteristics of the environment at mmWave frequencies. Compared to the lower frequency channels, mmWave channels comprise of fewer clusters [4]. Each of the channel taps , is approximately sparse in the spatial Fourier basis at mmWave [12]. Furthermore, the channel is approximately sparse along the time dimension as the delays of the propagation rays are heavily clustered within the delay spread. As the delays may not necessarily be an integer multiple of , there is a leakage effect along the time dimension. Let be the 2-D Fourier transform of , such that

 H[ℓ]=UNrxC[ℓ]U∗Ntx,∀ℓ∈{0,1,2,...,L−1}. (4)

The approximate sparsity of the mmWave MIMO channel along the angle and delay domains [28] is directly reflected in the matrices . A higher resolution dictionary can be used for the angle and delay dimensions to increase sparsity of the mmWave channel at the expense of higher dimensionality and higher frame coherence [29]. Our GAMP based approach will be robust to leakage effects that arise due to approximate sparsity.

## Iii Demystifying joint CFO and channel estimation at mmWave

In this section, we propose a sparse bilinear formulation for the joint estimation problem. We also identify existing techniques to solve the problem and describe their limitations in terms of scalability to other mmWave architectures and computational complexity.

### Iii-a Bilinear formulation

We derive a compact form for the received signal model in (1) for a SC-FDE system [3]. Let be a training block of length such that . The TX transmits a training sequence with a cyclic prefix of length , i.e.,

. The cyclic prefix padded transmission gets convolved with the frequency selective MIMO channel before sampling at the receiver. As usual, the first

samples of the received block that experience interference from the previous transmit block are discarded [3]. Let be the received block obtained after discarding the first received vectors. We define an circulant delay matrix , such that its first column is the canonical basis vector . We define a vector such that its entry has the phase error corresponding to the received vector , i.e., . Note that

is the standard deviation of the incremental phase errors in the Wiener phase noise process. The received samples corresponding to the transmit block defined by

can be expressed using (1) as

 Y=Qq(L−1∑ℓ=0H[ℓ]TJℓdiag(d(ϵ,β))+V). (5)

The phase errors in (5) are invariant along any column of the unquantized received block as there is a unique CFO and a phase noise process in the system. The representation in (5) can be further simplified to capture the structure in the phase errors and the channel.

We exploit the structure in the joint estimation problem using sparsity of the channel and the phase error vector in appropriate dictionaries. A compact representation of (5) can be obtained by following the same steps in [12], except for the diagonal matrix containing the phase errors. We use to denote the DFT of the phase error vector . With used to denote the noiseless unquantized version of in (5) such that , we have

 Z =L−1∑ℓ=0H[ℓ]TJℓdiag(d(ϵ,β)) =L−1∑ℓ=0UNrxC[ℓ]U∗NtxTJℓdiag(U∗Npb) =UNrx[C[0]C[1]C[2]...C[L−1]]Δ=C⎡⎢ ⎢ ⎢ ⎢ ⎢⎣U∗NtxTJ0U∗NtxTJ1⋮U∗NtxTJL−1⎤⎥ ⎥ ⎥ ⎥ ⎥⎦Δ=Fdiag(U∗Npb). (6)

The matrix is just a concatenation of the angle domain representation in (4) corresponding to each tap of the MIMO channel and is approximately sparse. Furthermore, the vector , i.e., the DFT of can be considered to be approximately sparse. The sparse representation of the phase error vector is valid in practice as the spread of the oscillator’s spectrum about the center frequency is relatively small compared to the bandwidth of the signal.

Now, we derive a sparse bilinear formulation for the joint estimation problem. Using (6), the quantized received block in (5) can be expressed as

 (7)

We define the vectors , and , and the vector . In vector notation, (6) can be written as

 z=diag(U∗Npb⊗aNrx(0))vec(UNrxCF). (8)

Notice that is just the all ones vector in dimension. Using the property , the received vector is expressed as

 y=Qq(diag(U∗Npb⊗aNrx(0))(FT⊗UNrx)c+v). (9)

We define the matrices and to rewrite (9) as

 y=Qq(diag(Gb)Ac+v). (10)

Estimating the phase errors and the channel is equivalent to estimating and from in (10). The joint estimation problem in (10) can be observed to be a noisy quantized bilinear problem in and , subject to the sparsity of and .

### Iii-B Limitations of existing techniques

#### Iii-B1 CFO robust methods

Existing sparse channel estimation methods that are robust to CFO discard the phase of the channel measurements. A hashing technique based non-coherent beam alignment algorithm was proposed in [17] for analog beamforming systems. This method, however, assumes fine control over the phase shifters, which is not necessarily the case with mmWave systems. The received signal strength (RSS) based method in [18] accounts for the limited phase control and uses pseudo-random phase shifts for compressive beam-training. The solutions in [17] and [18] assume a narrowband mmWave system and perform beam-alignment with just the magnitude of the channel measurements. These methods, however, cannot be used in one-bit receivers as energy detection with undithered one-bit ADCs is not feasible unless additional circuit components are used. For example, and are the same for any . As the only information provided by undithered one-bit ADCs is phase quantized to levels, discarding it due to phase errors leaves no information.

#### Iii-B2 Joint estimation using lifting

Lifting [21][30] is a convex relaxation technique that transforms a bilinear problem to a higher dimensional one and then recovers the original vectors by solving the higher dimensional problem. We describe the lifting technique applied to the joint estimation problem in (10). With denoting the number of entries in or , i.e., , the entry of can be given as

 zm =vecm(diag(Gb)Ac) (11) =G(m)bA(m)c (12) (13) =(A(m)⊗G(m))vec(bcT),∀m∈IM. (14)

We define a lifted variable and a measurement matrix , such that . Hence, the quantized measurements in (10) can be expressed as

 y=Qq(Φx+n). (15)

The lifted vector in (15) is sparse as it is just an outer product [20] of the sparse vectors and . Several CS-based algorithms [6][31] can be used to recover from the possibly under-determined noisy quantized system in (15). Using the SVD of the higher dimensional matrix estimate, the vectors in the bilinear problem can be estimated upto a scale factor.

Lifting followed by the SVD was applied to joint CFO and narrowband channel estimation for one-bit receivers in our previous work [19]. The main issue in extending our method in [19] to wideband systems arises due to the large dimensionality of the lifted problem. For instance, the dimension of to perform joint CFO and channel estimation in wideband systems would be . Using lifting necessarily implies solving for millions of variables for typical wideband mmWave systems, due to the large number of antennas and the need for additional pilots to compensate for the heavy quantization in low resolution systems.

The limitations of the existing phase error robust and joint estimation solutions in terms of architectural scalability and computational complexity, motivate the need to develop new low complexity joint estimation algorithms that can be applied to wideband systems and low resolution receivers.

## Iv Message passing based joint CFO and channel estimation

In this section, we give a brief introduction to PBiGAMP [22] and discuss its application to the joint estimation in (10). We exploit the inherent structure in our problem to derive a low complexity and memory efficient implementation of PBiGAMP for the joint estimation. Furthermore, we explain the CFO propagation effect induced by special training matrices that prevents further reduction in the computational complexity.

### Iv-a Introduction to PBiGAMP

The joint estimation problem in (10) can be solved using PBiGAMP [22] by considering , and of (8) and (10) as realizations of random vectors, say and . Let , , and be the elements of these random vectors. In general, deriving the closed form Minimum Mean-Squared Error (MMSE) estimates [32] of and is difficult as it requires marginalizing the joint PDF of and conditioned on . Using ideas from message passing, PBiGAMP can obtain the MMSE estimates of both the vectors in the bilinear problem.

We explain message passing using the factor graph [33] in Fig. 2, that shows the dependency between , the random variables (

) and their prior distributions. The circular nodes in the factor graph are called as variable nodes as they represent the random variables. The rectangular nodes in the factor graph are called as factor nodes and they contain the prior distribution of a random variable or the likelihood function associated with an observation. The messages in message passing are essentially probability distributions, also called as beliefs. The idea underlying message passing is to perform belief flows iteratively between the factors and the variables until all the variable nodes reach a consensus on their marginal probability distributions. As the factor graph for joint estimation is strongly connected, standard message passing can be computationally intractable.

Using ideas from Approximate Message Passing [10], PBiGAMP simplifies the messages by assuming a large number of variable nodes. Simulation results in [22] that show that PBiGAMP outperforms lifting techniques in Section III-B2, for IID Gaussian measurement matrices, motivate applying it to our problem. Furthermore, PBiGAMP performs optimization over the same number of variables in the problem, unlike lifting [21] that solves the problem in a higher dimensional space. For the joint estimation problem in typical wideband mmWave systems, PBiGAMP is memory efficient over lifting by several orders of magnitude.

### Iv-B PBiGAMP for joint estimation

In this section, we explicitly state PBiGAMP [22] for joint estimation in (10) and describe the information contained in the factor nodes of Fig. 2. To be consistent with the notation used in [22], we rewrite the random variable dependency corresponding to (11) in the tensor notation as

 zm=Nb∑i=1Nc∑k=1z(i,k)mbick, (16)

where , , and is an element of a third order tensor given by

 z(i,k)m=Gm,iAm,k. (17)

The output likelihood function in Fig. 2, denoted by is given by

 (18)

where

is the cumulative distribution function of the standard normal distribution. The sparsity of the vectors

and

is incorporated by assuming parametrized Bernoulli-Gaussian distributions for their priors

and . For simplicity, it is assumed that each entry of is independent of the other and identically distributed as . Similarly, the entries of are assumed to be IID, with as the distribution. Furthermore, the vectors and are assumed to be independent of each other. Let and denote the sparsity fraction of and . Let and be the variances of the coefficients corresponding to the non-zero support of the vectors and . With used to represent the Dirac-delta function, the Bernoulli-Gaussian distributions and can be given as

 pb(x) =λbδ(x)+(1−λb)N(0,σ2b), (19) pc(x) =λcδ(x)+(1−λc)N(0,σ2c). (20)

The parameters governing and , however, are not known apriori and can be learned by embedding PBiGAMP within the Expectation Maximization (EM) algorithm [11]. For a given set of likelihood functions and prior distributions, the vector variance PBiGAMP algorithm [22] to obtain the MMSE estimates of and in (10) is summarized in Table I.

The channel and the phase error vector can be derived using appropriate transformations over the PBiGAMP estimates. The vector obtained from PBiGAMP is just an estimate of the vectorized version of , the angle-delay domain representation of the wideband channel in (7). Let denote the angle-delay domain estimate of the wideband channel, such that . It can be seen from (6) that is just a concatenation of the angle domain representation of all the taps of the MIMO channel. Therefore, the tap of the antenna domain MIMO channel can be derived as

 ˆH[ℓ]=UNrx[ˆC(ℓNtx+1),ˆC(ℓNtx+2),...,ˆC(ℓNtx+Ntx)]U∗Ntx. (21)

The vector derived from PBiGAMP is an estimate of the DFT of the phase error vector . Estimating the CFO from

is just a single tone frequency estimation problem. As the phase noise is modelled as a Wiener process, we apply the Extended Kalman Filter (EKF)

[34] over the time domain samples, i.e., , to estimate .

In this work, we exploit the compressibility of the phase error vector in the DFT basis for joint estimation. It is possible, however, to incorporate the statistics of the phase noise by replacing the nodes corresponding to in Fig. 2 with the phase error variables and adding factors corresponding to the phase noise process. In such case, the message passing algorithm must handle the non-linear dependence of on the phase errors.

### Iv-C PBiGAMP: From theory to practice

As seen from Table I, the generic implementation of PBiGAMP is memory and computationally intensive, as it involves repeated operations over a third order tensor in (17). In this section, we provide insights into the key equations of Table I and describe our low complexity implementation for joint estimation.

The equations in PBiGAMP are essentially determined by the belief flows from the factor nodes to the variable nodes, and vice versa. Without loss of generality, we describe the message flow between the variable node and the factor node in Fig. 2. In standard message passing, the message sent from a factor node to essentially represents the PDF of presumed by that factor node. In a fully connected factor graph, the node receives messages from all the factor nodes () in addition to its prior distribution . As receives several beliefs from different factors, the belief sent by to is just a normalized product of all the beliefs received by except the one from . For generic priors and likelihood functions in the factor graph, performing standard message passing can be difficult as the belief flows are flows of PDFs that are functions.

PBiGAMP simplifies standard message passing using the central limit theorem (CLT) and the Taylor series approximation. Notice that

receives beliefs from all its neighbouring variable nodes and contains the likelihood function in itself. The belief sent from to is computed by multiplying the likelihood function, with all the incoming beliefs to except the one from , and then integrating over all the random variables except . The multiplication followed by integration essentially yields the PDF of presumed by the factor node . The integration is often multidimensional and can be difficult to compute. If depends on a large number of independent variable nodes through a linear function, then the linear combination can be approximated as a Gaussian random variable using the CLT [10]. In such case, the variable nodes can send just the mean and variances of the PDFs to the factors and this information is sufficient to compute the mean and variance of the Gaussian random variable, as seen in (R3) and (R5) of Table I. The message sent from to can now be computed by multiplying the likelihood at with the compound Gaussian PDF, and marginalizing the product with respect to . In general, the likelihood function can be non-linear in nature and can yield a complicated PDF of . Using a second order Taylor series expansion for the log PDF, PBiGAMP [10] simplifies the belief sent from to to a Gaussian whose mean and variance can be computed from (R7)-(R9) of Table I. For -bit ADCs, the closed form expressions for the conditional mean and variance in (R7) and (R8) can be found in [35, Appendix A]. Unlike standard message passing, PBiGAMP is computationally tractable as the messages contain just the mean and variances of the PDFs.

After several approximate message flows between the factor nodes and the variable nodes, PBiGAMP is expected to converge. The MMSE of is computed as the expectation of the effective marginal, i.e., the normalized product of all the Gaussian PDFs received by from the factor nodes and the prior distribution on . Notice that the normalization has to be done to ensure that the PDF integrates to 1. The expectation step for the MMSE of is given in (R18) using (R11) and (R12) as the intermediate steps. Similarly, the expectation for the MMSE of is given in (R16) using (R13) and (R14) as the intermediate steps. Thus, PBiGAMP provides estimates of the vectorized sparse channel and the DFT of the phase error vector.

The generic implementation of PBiGAMP is computationally expensive primarily due to (R1), (R2), (R5), (R12) and (R14) in Table I. It can be verified that each of these operations have a complexity of for a single PBiGAMP iteration to perform joint estimation. For instance, (R1) requires computing the scalar for every and , Therefore, (R1) demands computations, as each scalar computation requires multiplications. As , and for the joint estimation, (R1) has a complexity of for every PBiGAMP iteration. By exploiting the structure in the joint estimation problem, the complexity of PBiGAMP can be significantly reduced.

We describe our fast implementation of PBiGAMP for the joint estimation in the following sub-sections. The iteration variables of PBiGAMP are defined as , and , such that and are the and entries of and .

#### Iv-C1 Operations (R1)-(R3)

For an , we have

 ˆz(io,∗)m(t) =Nc∑k=1z(io,k)mˆck(t) (22) =Nb∑i=1Nc∑k=1z(i,k)meio,Npiˆck(t) (23) =vecm(UNrxˆC(t)Fdiag(U∗Npeio,Np)) (24) =vecm(UNrxˆC(t)Fdiag(U∗Np(io))), (25)

where the compact form in (25) is obtained by going back from the tensor formulation to the original bilinear model in (6). Likewise, for a , we have

 ˆz(∗,ko)m(t) =Nb∑i=1Nc∑k=1z(i,k)mˆbi(t)eko,Nck (26) (27) =vecm(UNrx(ro)F(co)diag(U∗Npˆb(t))), (28)

where correspond to the row and column of the matrix version of , with

 ko =(co−1)Nrx+ro. (29)

Similarly, (R3) can be computed as

 ˆz(∗,∗)m(t)=vecm(UNrxˆC(t)Fdiag(U∗Npˆb(t))). (30)

To evaluate (30), the product can be found using computations. Using the FFT over the resultant product, can be computed with an additional complexity of . Finally, the complexity to multiply the resultant matrix with the IFFT of is . Therefore, the computational complexity of (30) is , unlike of the generic implementation using (R1)-(R3).

#### Iv-C2 Operations (R4),(R5)

It can be noticed from (25) that is invariant with respect to and is given by

 ∣∣ˆz(i,∗)m(t)∣∣=1√Npvecm(∣∣UNrxˆC(t)F∣∣). (31)

Using the invariance property in (31), a compact version of the first summand in can be expressed as

 Np∑i=1vbi∣∣ˆz(i,∗)m(t)∣∣2=∑Npi=1vbiNpvecm(∣∣UNrxˆC(t)F∣∣2). (32)

For the second summand in (R4), it can be shown from (28) that

 ∣∣ˆz(∗,ko)m(t)∣∣ =1√Nrxvecm(∣∣F(co)diag(U∗Npˆb(t))∣∣⊗aNrx(0)). (33)

Furthermore, as denotes the column number corresponding to (see (29)), is invariant , where . To use this invariance for efficient computation of the second summand in (R4), we define a row vector containing the column-wise mean corresponding to matrix version of as

 μcn(t)=1Nrx∑vck(t)k∈InNrx∖I(n−1)Nrx. (34)

With some algebraic manipulation, the second summand in (R4) can be simplified as

 Nc∑k=1vck(t)∣∣ˆz(∗,k)m(t)∣∣2=vecm[(μc(t)∣∣Fdiag(U∗Npˆb(t))∣∣2)⊗aNrx(0)]. (35)

To simplify the computations involved in (R5), we expand the summand using (17) as

 Nb∑i=1Nc∑k=1vbi(t)vck(t)∣∣z(i,k)m∣∣2 =Nb∑i=1Nc∑k=1vbi(t)vck(t)|Gm,iAm,k|2 (36) =∑Npi=1vbi(t)NpNc∑k=1vck(t)∣∣ATk,m∣∣2, (37)

where (37) follows from (36) as . Besides, as , the entries of are invariant within blocks of size . With arguments similar to the simplifications involved in the second summand of (R4), (R5) can be efficiently evaluated as

 Nb∑i=1Nc∑k=1vbi(t)vck(t)∣∣z(i,k)m∣∣2=∑Npi=1vbi(t)Npvecm[(μc(t)|F|2)⊗aNrx(0)]. (38)

#### Iv-C3 Operations (R11),(R13)

From (33), it can be observed that is fixed for and , where and . The invariance of with respect to and arise because of the kronecker product with and the column invariance in (29). To exploit this property in computing of (R11), we construct a vector to contain the column-wise mean corresponding to the matrix version of , i.e.,

 μzk(t)=1Nrx∑vsm(t)m∈IkNrx∖I(k−1)Nrx. (39)

With the above definitions, a simplified version of in (R11) can be given as

 M∑m=1vsm(t)∣∣ˆz(∗,k)m∣∣2=veck[(∣∣Fdiag(U∗Npˆb)∣∣2μz(t))⊗aNrx(0)]. (40)

From (31), we rewrite in (R13) as

 (vqi(t))−1 =M∑m=1vsmvecm(∣∣UNrxˆC(t)F∣∣2)

#### Iv-C4 Operations (R12),(R14)

We define and consider the term in the second summand of (R12) for . From (28) and (29), we have

 M∑m=1ˆsm(t)ˆz(∗,ko)m(t)∗=M∑m=1ˆsm(t)vecm[¯¯¯¯¯UNrx(ro)¯¯¯¯¯¯¯¯¯¯ˆF(t)(co)]. (41)

With defined such that , (41) can be expressed as,

 M∑m=1ˆsm(t)ˆz(∗,ko)m(t)∗ =⟨ˆS(t),UNrx(ro)ˆF(t)(co)⟩ (42) =(UNrx(ro))∗ˆS(t)(ˆF(t)(co))∗ (43) =vecko(U∗NrxˆS(t)ˆF(t)∗). (44)

The term in the third summand of (R12) can be given by

 M∑m=1vsm(t)Np∑i=1vbi(t)∣∣z(i,k)m∣∣2 =M∑m=1vsm(t)∑Npi=