The main impairments to a communication system, whether wireless or wireline, are due to multipath propagation through a physical medium and additive noise. Multipath propagation is commonly modeled as a linear convolution that, in the slow-fading scenario, can be characterized by a channel impulse response that is fixed over the duration of one codeword. In the well-known “uncorrelated Rayleigh/Ricean-fading” scenario, the (complex-baseband) channel “taps”
are modeled as independent circular Gaussian random variables, and in the well known “additive white Gaussian noise” (AWGN) scenario, the time-domain additive noise samplesare modeled as independent circular Gaussian random variables .
In this work, we focus on applications where the uncorrelated-Rayleigh/Ricean-fading assumption holds but the AWGN assumption does not. Our work is motivated by extensive measurement campaigns of terrestrial wireless installations wherein the additive noise is impulsive, with peak noise amplitudes reaching up to dB above the thermal background noise level [2, 3, 4, 5, 6, 7]. The noise affecting powerline communications (PLC) has also been shown to be highly impulsive, as well as bursty [8, 9, 10].
We restrict our attention to systems employing (coded or uncoded) orthogonal frequency division multiplexing (OFDM) , as used in many modern cellular wireless standards (e.g., IEEE802.11n and LTE) and PLC standards (e.g., PRIME and IEEE1901). OFDM is advantageous in that it facilitates data communication across convolutive multipath channels with high spectral efficiency and low complexity.
The impulsivity of noise has particular consequences for OFDM systems. Recall that, in conventional OFDM receivers, the time-domain received signal is converted to the frequency domain through a discrete Fourier transform (DFT), after which each subcarrier (or “tone”) is demodulated independently. Such tone-by-tone demodulation is in fact optimal with AWGN and perfect channel estimates , and is highly desirable from a complexity standpoint, since it leaves the DFT as the primary source of receiver complexity, and thus requires only multiplies per symbol for tones. When the time-noise is impulsive, however, the corresponding frequency-domain noise samples will be highly dependent, and tone-by-tone demodulation is no longer optimal. We are thus strongly motivated to find near-optimal demodulation strategies that preserve the complexity of classical OFDM. In this work, we propose one such solution that exploits recent breakthroughs in loopy belief propagation.
I-B Prior Work
I-B1 OFDM Reception in Impulsive Noise
One popular approach to OFDM reception in impulsive noise stems from the argument that the noiseless time-domain received OFDM samples can be modeled as i.i.d Gaussian (according to the central limit theorem with sufficiently many tones), in which case the noise impulses can be detected using a simple threshold test. This approach straightforwardly leads to a decoupled procedure for impulse mitigation and OFDM reception: the time-domain received signal ispre-processed via clipping or blanking techniques [11, 12] or (nonlinear) MMSE estimation , and the result passed to a conventional DFT receiver for decoding. While agreeable from a complexity standpoint, these techniques perform relatively poorly, especially when the power of the implusive noise is comparable to the power of the OFDM signal, or when higher order modulations are used . This loss of performance can be explained by the fact that OFDM signal structure is not exploited for noise mitigation. In an attempt to improve performance, it has been suggested to iterate between such pre-processing and OFDM decoding, but the approaches suggested to date (e.g., [14, 15, 16, 17]) have shown limited success, mainly because the adaptation of preprocessing with each iteration is challenging and often done in an ad-hoc manner.
Another popular approach models the time-domain impulsive noise sequence as a sparse vector and then usessparse-reconstruction techniques to estimate this sequence from the observed OFDM null and pilot (i.e., known) tones. The recovered impulse vector is then subtracted from the time-domain received signal, and the result is passed to a conventional DFT receiver for decoding. Algebraic techniques were proposed in [18, 19, 20], and sparse reconstruction techniques based on compressive-sensing were proposed in [21, 22]. With typical numbers of known tones, these techniques have been shown to work well for very sparse impulsive noise sequences (e.g., one impulse in a -tone OFDM system with known tones) but not for practical sparsity rates [21, 23].
A more robust approach was proposed in , which performs joint symbol detection and impulse-noise estimation using sparse Bayesian learning (SBL). Still, because  decouples channel estimation from impulse-noise estimation and symbol detection, and because it integrates coding in an ad-hoc manner, there is considerable room for improvement. In addition, it performs matrix inversion that is impractical for typical OFDM receivers with hundreds of tones.
I-B2 Factor Graph Receivers
Factor-graph-based receivers  have been proposed as a computationally efficient means of tackling the difficult task of joint channel, symbol, and bit (JCSB) estimation. Here, messages (generally in the form of pdfs) are passed among the nodes of the factor graph according to belief propagation strategies like the sum-product algorithm (SPA) . Due to the loopy nature of the OFDM factor graph, however, exact implementation of the sum product algorithm is infeasible, and so various approximations have been proposed [26, 27, 28, 29]. Notably,  merged the “generalized approximate message passing” (GAMP) algorithm  with a soft-input soft-output decoder in a “turbo” configuration to accomplish near-optimal111The approach was shown to be near-optimal in the sense of achieving  the pre-log factor of the sparse channel’s noncoherent capacity . joint structured-sparse-channel estimation and decoding of bit-interleaved coded-modulation (BICM)-OFDM with complexity. To our knowledge, no factor-graph-based OFDM receivers have been proposed to tackle impulsive noise, however.
In this paper, we propose a novel OFDM receiver that performs near-optimally in the presence of impulsive noise while maintaining the complexity order of the conventional -tone OFDM receiver. Our approach is based on computing joint soft estimates of the channel taps, the impulse-noise samples, the finite-alphabet symbols, and the unknown bits. Moreover, all observed tones (i.e., pilots, nulls, and data tones) are exploited in this joint estimation. To do this, we leverage recent work on “generalized approximate message passing” (GAMP) , its “turbo” extension to larger factor graphs , and off-the-shelf soft-input/soft-output (SISO) decoding . The receiver we propose can be categorized as an extension of the factor-graph-based receiver  that explicitly addresses the presence of impulsive noise. The resulting receiver provides a flexible performance-versus-complexity tradeoff and can be parallelized, making it suitable for FPGA implementations.
I-D Organization and Notation
In Section II, we describe our OFDM, channel, and noise models, and provide an illustrative example of impulsive noise. Then, in Section III, we detail our proposed approach, which we henceforth refer to as joint channel, impulse, symbol, and bit (JCISB) estimation. In Section IV we provide extensive numerical results, and in Section V we conclude.
Notation: Vectors and matrices are denoted by boldface lower-case () and upper-case notation (), respectively. then represents the sub-matrix constructed from rows and columns of , where the simplified notation means and “” indicates all columns of . The notations and
denote transpose and conjugate transpose, respectively. The probability density function (pdf) of a random variable (RV)is denoted by , with the subscript omitted when clear from the context. Similarly, for discrete RVs, the probability mass function (pmf) is denoted by . For a circular Gaussian RV with mean
and variance, we write the pdf as . The expectation and variance of a RV are then given by and , respectively. We use the sans-serif font to indicate frequency domain variables like and bold sans-serif to indicate frequency domain vectors like .
Ii System Model
Ii-a Coded OFDM Model
We consider an OFDM system with tones partitioned into pilot tones (indexed by the set ), null tones (indexed by the set ), and data tones (indexed by the set ), each modulated by a finite-alphabet symbol chosen from an -ary constellation . The coded bits (which determine the data symbols) are generated by encoding information bits using a rate- coder, interleaving them, and allocating the resulting bits among an integer number of OFDM symbols.
In the sequel, we use with to denote the th element of , and to denote the corresponding bits as defined by the symbol mapping. Likewise, we use to denote the scalar symbol transmitted on the th tone of the th OFDM symbol. Based on the tone partition, we note that: for all , where is a known pilot symbol; for all ; and for some such that for all , where denotes the coded/interleaved bits corresponding to . On the frame level, we use to denote the coded/interleaved bits allocated to the data tones of the th OFDM symbol, and to denote the entire codeword obtained from the information bits by coding/interleaving. Similarly, we use to denote the th OFDM symbol’s tone vector, including pilot, null, and data tones.
For OFDM modulation, an inverse of the unitary -point discrete Fourier transform (IDFT) matrix is applied to the th OFDM symbol’s tone vector , producing the time-domain sequence , to which a cyclic prefix is prepended. The resulting sequence propagates through an -tap linear-time-invariant channel with impulse response before being corrupted by both AWGN and impulsive noise. Assuming a cyclic prefix of length , inter-symbol interference is avoided by simply discarding the cyclic prefix at the receiver, after which the remaining samples are
where is the time-domain noise vector and is the circulant matrix formed by . Applying a DFT, the resulting frequency-domain received vector becomes
where is the frequency-domain channel vector, is the frequency-domain noise vector, and denotes the Hadamard (i.e., elementwise) product. The second equality in (2) follows from the fact that a circulant matrix is diagonalized by the Fourier basis. In fact, (2) illustrates the principal advantage of OFDM: each transmitted tone experiences a flat scalar subchannel, since
Ii-B Channel Modeling
We assume that the channel taps remain constant during the entire duration of one OFDM symbol, as required by (2). Since we make no assumptions on how the taps change across symbols, for simplicity we take and to be statistically independent for . Furthermore, we use the Rayleigh-fading uncorrelated-scattering model
Ii-C Impulsive Noise Models
In many wireless and power-line communication (PLC) systems, the additive noise is non-Gaussian (see the example in Fig. 1) and the result of random emission events from uncoordinated interferers (due to, e.g., aggressive spectrum reuse) or non-communicating electronic devices. In his pioneering work, Middleton modeled these random spatio-temporal emissions, or the “noise field,” using Poisson point processes (PPP), giving rise to the “Middleton class-A” and “Middleton class-B” noise models. (For a recent review see .) Recently, his approach has been extended to modeling fields of interferers in wireless and PLC networks using spatial and temporal PPPs, and the resulting interference was shown to follow either the Symmetric alpha stable, the Middleton class-A (MCA), or the more general Gaussian mixture (GM) distribution, depending on the network architecture [37, 38, 9, 10]. Fig. 1 illustrates that a GM model provides a significantly better fit to a noise realization collected from a receiver embedded in a laptop than a Gaussian model does.
Since our factor-graph-based receiver is inherently Bayesian, these statistical models provide natural priors on the noise. Thus, we model the additive noise using a GM model, noting that—given the pdf parameters—there is no distinction between the MCA and GM models. In particular, we decompose222Our approach is equivalent to modeling the total noise by a GM pdf with and for . a given time-domain noise sample into a Gaussian background component and a sparse impulsive component with Bernoulli-GM pdf
where denotes the Dirac delta and . Equivalently, we can model the (hidden) mixture state of the impulsive component as a random variable, giving rise to the hierarchical model (with )
In many applications, such as PLC, the noise is not only impulsive but also bursty
and thus the noise samples are no longer statistically independent. Such burstiness can be captured via a Bernoulli-Gaussian hidden Markov model (BGHMM) on the impulse-noise[8, 39] or equivalently a Markov model on the GM state in (6). For this, we model the sequence as a homogeneous (stationary)
-ary Markov chain with a state transition matrixsuch that
In this case, the marginal pmf of steady-state obeys , and the mean duration of the event is .
As an illustrative example, Fig. 2 plots two realizations of the total noise with impulsive component generated by the hierarchical Bernoulli-GM model (6). Both realizations have identical marginal statistics: their impulsive components have two non-trivial emission states with powers dB and dB above the background noise power that occur and of the time, respectively. However, in one case the emission state was generated i.i.d whereas in the other case it is generated Markov with state-transition matrix
The GHMM realization clearly exhibits bursty behavior.
Iii Message-Passing Receiver Design
In this section, we design computationally efficient message-passing receivers that perform near-optimal bit decoding which, as we shall see, involves jointly estimating the coded bits, finite-alphabet symbols, channel taps, and impulsive noise samples. In doing so, our receivers exploit knowledge of the statistical channel and noise models discussed above and the OFDM signal structure (i.e., the pilot and null tones, the finite-alphabet symbol constellation, and the codebook).
Iii-a MAP Decoding of OFDM in Impulsive Noise
Maximum a posteriori (MAP) decoding, i.e.,
is well known to be optimal in the sense of minimizing the bit-error rate. Here,
collects the received OFDM symbols of the corresponding frame. Using the law of total probability, we can write the posterior information-bit probability from (9) as
where “” denotes equality up to a constant, , and the information bits are assumed to be independent with . Equation (12) shows that optimal decoding of involves marginalizing over the finite-alphabet symbols , coded bits , noise states , impulse noise samples , channel taps , and other info bits .
The probalistic structure exposed by the factorization (12) is illustrated by the factor graph in Fig. 3. There and in the sequel, for brevity, we drop the “” (i.e., OFDM symbol) index when doing so does not cause confusion.
Clearly, direct evaluation of from (12) is computationally intractable due to the high-dimensional integrals involved. Belief propagation (BP), and in particular the sum-product algorithm (SPA)  described below, offers a practical alternative to direct computation of marginal posteriors. In fact, when the factor graph has no loops, the SPA performs exact inference after only two rounds of message passing (i.e., forward and backward). On the other hand, when the factor graph is loopy, the computation of exact marginal posteriors is generally NP hard  and thus the posteriors computed by BP are generally inexact. Nevertheless, loopy BP has been successfully applied to many important problems, such as multi-user detection [41, 42], turbo decoding , LDPC decoding , compressed sensing [44, 30], and others.
In fact, for certain large densely loopy graphs that arise in the context of compressed sensing, SPA approximations such as the AMP  and GAMP  algorithms are known to obey a state evolution whose fixed points, when unique, yield exact posteriors [45, 46]. Looking at the factor graph in Fig. 3, we see densely loopy sub-graphs between the factors and the time-domain noise samples and channel taps , which are due to the linear mixing of the Fourier matrix . It is these types of densely loopy graphs for which AMP and GAMP are designed.333Although rigorous GAMP guarantees have been established only for generalized linear inference problems with i.i.d sub-Gaussian transform matrices , equally good performance has been empirically observed over much wider classes of matrices . In the sequel, we will describe exactly how we combine the sum-product and GAMP algorithms for approximate computation of the bit posteriors in (12). First, however, we review the SPA.
Iii-B Belief Propagation using Sum-Product Algorithm
Belief propagation (BP) transforms a high-dimensional marginalization problem (like (12)) into a series of local low-dimensional marginalization problems by passing beliefs, or messages, which usually take the form of (possibly un-normalized) pdfs or pmfs, along the edges of a factor graph. The sum-product algorithm (SPA)  is probably the best known approach to BP, and it operates according to the following rules:
Iii-B1 Messages from Factor Nodes to Variables
Suppose the pdf factor depends on the variables . Then the message passed from factor node to variable node is
representing the belief that node has about the variable .
Iii-B2 Messages from Variables to Factor Nodes
Suppose the factors all involve the variable . Then the message passed from variable node to factor node is
and represents the belief about the variable that node passes to node .
Iii-B3 Marginal Beliefs
SPA’s approximation to the marginal posterior pdf on the variable is
where is a normalization constant.
Iii-C Joint Channel/Impulse-Noise Estimation and Decoding
We now propose a strategy to approximate the bit posteriors in (12) by iterating (an approximation of) the SPA on the loopy factor graph in Fig. 3. To distinguish our approach from others in the literature, we will refer to it as “joint channel, impulse, symbol, and bit estimation” (JCISB).
Given the loopy nature of the factor graph, there exists considerable freedom in the message-passing schedule. In JCISB, we choose to repeatedly pass messages from right to left, and then left to right, as follows.
Beliefs about coded bits flow rightward through the symbol-mapping nodes , the finite-alphabet symbol nodes , and into the factor nodes .
GAMP-based messages are then passed repeatedly between the and nodes until convergence.
GAMP-based messages are passed repeatedly between the and nodes until convergence, and then through the nodes using the forward-backward algorithm, alternating these two steps until convergence.
Finally, the messages are propagated from leftward through the symbol nodes , the symbol-mapping nodes , the coded-bit nodes , and the coding-interleaving block—the last step via an off-the-shelf soft-input/soft-output (SISO) decoder.
In the sequel, we will refer to steps 1)–4) as a “turbo” iteration, and to the iterations within step 3) as “impulse iterations,” We note that it is also possible to execute a parallel schedule if the hardware platform supports it. The details of these four message passing steps are given below.
Iii-C1 Bits to Symbols
Beliefs about the coded bits (for each data tone ) are first passed through the symbol mapping factor node . The SPA dictates that
where (III-C1) follows from the deterministic symbol mapping . The resulting message is then copied forward through the node, i.e., , also according to the SPA. Note that, at the start of the first turbo iteration, we have no knowledge of the bits and thus we take to be uniform across for all .
Iii-C2 GAMP for Channel Estimation
The next step in our message-passing schedule is to pass messages between the factor nodes and the time-domain channel nodes . According to the SPA, the message passed from to is
Exact evaluation of (III-C2) involves an integration of the same high-dimensionality that made (12) intractable, with exponential complexity in . Thus, we instead approximate the message passing between the and nodes using generalized approximate message passing (GAMP) algorithm  reviewed in Appendix A and summarized in Table I.
To do this, we temporarily treat the messages and as fixed, allowing us to employ “,” using the notation established in the caption of Table I. Definition (D1) [later used in steps (R3)–(R4)] requires us to specify the likelihood relating the transform output to the corresponding observed output . From Fig. 3, we see that there are two types of belief flowing into each node (apart from beliefs about ) that determine this likelihood: beliefs about the symbols , which we parameterize as with , and beliefs about the frequency-domain impulsive noise , which GAMP approximates as , where the values were computed by in the previous turbo iteration.444During the first turbo iteration, we use and . Here, refers to the impulsive component of the frequency-domain noise , with , so that [from (2)]
also requires us to derive the “input MMSE estimation functions” and for GAMP steps (R9)–(R10). Given the channel model specified in Section II-B and definition (D2), it is straightforward to show  that the input MMSE estimation functions are and .
After is iterated to convergence, the outputs and of steps (R4)–(R3) are close approximations to the marginal posterior mean and variance, respectively, of . These outputs will be used in the next step of the message-passing schedule, as described below. Similarly, the outputs and of steps (R10)–(R9) are close approximations to the marginal posterior mean and variance, respectively, of .
Iii-C3 Turbo-GAMP for Noise Estimation
The next step in our schedule is to pass messages between the factor nodes , the time-domain impulse-noise nodes , and the noise-state nodes . According to the SPA, the message passed from to is
Although GAMP can help approximate the messages in (III-C3), GAMP alone is insufficient due to connections between the nodes, which are used to model the burstiness of the time-domain impulse-noise . However, recognizing that the underlying problem is estimation of a clustered-sparse sequence from compressed linear measurements, we can use the solution proposed in , which alternated (G)AMP with the forward-backward algorithm , as described below.
First, by temporarily treating the messages , , and as fixed, we can apply under the likelihood model
implied by (5). In (18), are the symbol beliefs coming from the nodes and are the frequency-domain channel estimates previously calculated by . Meanwhile, in (19), represents the pmf on the noise state that is set as . The resulting output MMSE estimation functions, derived in Appendix C, are listed in TABLE II, and the input MMSE estimation functions are
Here, is the posterior pmf for noise-state