I Introduction
The design of advanced receiver algorithms is crucial to meet the stringent requirements of modern communication systems. Motivated by the successful application of the “turbo” principle in the decoding of channel codes, a large number of works have been devoted to the design of turbo receivers (see [1]
and the references therein). While in many of these works the receiver modules are individually designed and heuristically interconnected to exchange soft values, iterative receiver algorithms can be rigorously designed and better understood as instances of message-passing inference techniques (e.g., see
[2]). In this context, variational Bayesian inference in probabilistic models
[3] has proven to be a very useful tool for designing receivers in which tasks like channel estimation, detection and decoding are jointly derived. Among the variational techniques, belief propagation (BP) [4, 5] has found the most widespread use. Originally applied to the decoding of channel codes, BP has been shown to be especially efficient in discrete probabilistic models. An alternative to BP is the mean field (MF) approximation and its message-passing counterpart, usually referred to as variational message passing [6]. MF inference has been successfully applied to continuous probabilistic models involving probability density functions (pdfs) belonging to an exponential family, in which BP suffers from numerical intractability. Other notable examples of general-purpose inference techniques are expectation-maximization (EM)
[7] and expectation propagation (EP) [8]. EM is a special case of MF, where the approximate pdfs – referred to as beliefs – are Dirac delta functions; EP can be seen as an approximation of BP where some beliefs are approximated by pdfs in a specific exponential family. Attempts to find a unified framework encompassing all these techniques include the divergence interpretation in [9] and the region-based free energy approximations in [10]. Following the latter approach, a novel hybrid message-passing inference framework combining BP and the MF approximation was recently proposed in [11].

In this paper, we investigate the design of receivers that perform joint channel estimation and data decoding in a generic communication system. For this purpose, we capitalize on the combined inference framework [11]
, which provides some degree of freedom in the choice of the parts of the factor graph in which either BP or MF is applied. We show that this framework can be modified to naturally embed EP, EM and BP with Gaussian approximation of some messages. Then, we apply these hybrid inference techniques to the underlying probabilistic model of the system and obtain four receiver algorithms, whose performance we assess by simulating a wireless system.
Notation: we denote by the cardinality of a finite set ; the relative complement of in is written as ; the set is denoted by
. Boldface lowercase and uppercase letters are used to represent vectors and matrices, respectively; superscripts
and denote transposition and Hermitian transposition, respectively. The Hadamard product of two vectors is denoted by . For a vector , we write ; for a matrix , denotes its th entry, is the matrix with the th row and th column deleted, denotes the column vector , and is the row vector. The pdf of a multivariate complex Gaussian distribution with mean
and covariance matrix is denoted by . We write when for some positive constant . We denote by the approximation of the pdf in the argument with a Gaussian pdf with the same mean and covariance matrix. The Dirac delta function is denoted by .

II Message-Passing Inference Algorithms
We begin by concisely describing the unified message-passing algorithm that combines the BP and MF approaches (see [11] for details). Then, we briefly show how other widespread inference algorithms can be obtained as particular instances or slight modifications of the unified framework.
Let be an arbitrary pdf of a random vector which factorizes as
(1) 
where is the vector of all variables that are arguments of the function . We have grouped the factors into two sets that partition : and . The factorization in (1) can be visualized in a factor graph [4] representation. We define to be the set of indices of all variables that are arguments of function ; similarly, denotes the set of indices of all functions that depend on . The parts of the graph that correspond to and to are referred to as the “BP part” and the “MF part”, respectively. We denote the variable nodes in the BP part by and those in the MF part by .
The combined BP-MF inference algorithm approximates the marginals , by auxiliary pdfs called beliefs. They are computed as [11]
(2) 
with
(3) 
where and are constants that ensure normalized beliefs.
Belief propagation is obtained as a particular case of BP-MF by setting , since in this case the expressions in (3) reduce to the BP message computations. Similarly, mean field is an instance of BP-MF when .
Expectation propagation is very similar to BP, the main difference being that it constrains the beliefs of some variables to be members of a specific exponential family. Assuming Gaussian approximations of the beliefs, EP can also be integrated in the BP-MF framework by modifying the messages
(4) 
for all , .
The expectation-maximization algorithm is a special case of MF in which the beliefs of some variables are constrained to be Dirac delta functions [11]. Again, we include this approximation in the BP-MF framework. This leads to for all and , where maximizes the unconstrained belief (2). We refer to this modified algorithm as BP-EM.
III Probabilistic System Model
In this section, we present the signal model of our inference problem and its graphical representation. These will establish the baseline for the derivation of message-passing receivers.
We analyze a system consisting of one transmitter and one receiver. A message represented by a vector of information bits is conveyed by sending data and pilot channel symbols having the sets of indices and , respectively, such that and . Specifically, vector is encoded and interleaved using a rate channel code and a random interleaver into the vector of length . For each , the subvector is mapped to a data symbol with , where is a discrete complex modulation alphabet of size . Symbols are multiplexed with pilot symbols , which are randomly selected from a QPSK modulation alphabet. Finally, the aggregate vector of channel symbols is sent through a channel with the following input-output relationship:
(5) 
The vector contains the received signal samples, is the vector of channel coefficients, and contains the samples of additive noise and has the pdf for some positive component precision . Note that (5) can model any channel with a multiplicative effect that is not affected by intersymbol interference, e.g., a time-varying frequency-flat channel or the equivalent channel in the frequency domain in a multicarrier system.
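As a concrete illustration of the signal model (5), the following sketch maps coded bits to symbols, multiplexes them with a pilot, and passes the block through the multiplicative channel. All function names, the specific QPSK Gray mapping, and the index sets are illustrative assumptions, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)

def qpsk_map(bits):
    """Gray-map pairs of coded bits onto unit-energy QPSK symbols
    (illustrative; the model allows any 2^Q-ary alphabet)."""
    b = np.asarray(bits).reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def multiplex(data_syms, pilot_syms, data_idx, pilot_idx):
    """Place data and pilot symbols into one block of channel symbols
    according to their disjoint index sets."""
    x = np.empty(len(data_syms) + len(pilot_syms), dtype=complex)
    x[list(data_idx)] = data_syms
    x[list(pilot_idx)] = pilot_syms
    return x

def channel(x, h, noise_precision, rng):
    """Memoryless multiplicative channel y = h * x + w, with w i.i.d.
    circular complex Gaussian of component precision `noise_precision`
    (i.e., component variance 1 / noise_precision)."""
    sigma = np.sqrt(1.0 / noise_precision)
    n = len(x)
    w = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * sigma / np.sqrt(2)
    return h * x + w

# Example: one pilot at index 0, three data symbols at indices 1..3.
data = qpsk_map(rng.integers(0, 2, size=6))
x = multiplex(data, np.array([(1 + 1j) / np.sqrt(2)]), [1, 2, 3], [0])
h = (rng.standard_normal(4) + 1j * rng.standard_normal(4)) / np.sqrt(2)
y = channel(x, h, noise_precision=100.0, rng=rng)
```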
Based on the above signal model, we can state the probabilistic model which captures the dependencies between the system variables. The pdf of the collection of observed and unknown variables factorizes as
(6) 
where and incorporate the observations in and are given by
(7)  
(8) 
is the prior pdf of the vector of channel coefficients for which we set
(9) 
stand for the modulation mapping, accounts for the coding and interleaving operations and is the prior pmf of the th information bit. To obtain (6), we used the fact that is conditionally independent of and given , is independent of , and , the noise samples are i.i.d., and each data symbol is conditionally independent of all the other symbols given . The factorization in (6) can be visualized in the factor graph depicted in Fig. 1. The graph of the code and interleaver is not explicitly given, its structure being captured by .
IV Message-Passing Receiver Schemes
In this section, we derive iterative receiver schemes by applying different inference algorithms to the factor graph in Fig. 1. The receiver has to infer the beliefs of the information bits using the observed vector and prior knowledge, i.e., the pilot symbols and their set of indices , the noise precision , the channel statistics in (9), the modulation mapping and the structure of the channel code and interleaver.
We set and (defined in Section II for a general probabilistic model) to be the sets of all factors and variables, respectively, contained in our probabilistic model. Next, we show that the BP algorithm resulting from setting yields messages of intractable complexity. Assume that, by running BP in the part of the graph containing the modulation and code constraints, we obtain the messages
(10) 
with , where represent extrinsic information on symbol . These messages are further passed as . Then, for each , we compute the message
(11) 
while for all set
(12) 
Note that the message in (11) is proportional to a mixture of Gaussian pdfs with components. Then, after setting for all and for all , the message from to reads
(13) 
Using (9), (11) and (12), the message in (13) becomes a Gaussian mixture with and components for and , respectively. Clearly, the computation of such messages is intractable and one has to use approximations.
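The standard remedy, used in the next subsection, is to collapse such a mixture into a single Gaussian with the same first and second moments. A minimal sketch of that mixture-to-Gaussian collapse, stated for scalar circular complex Gaussians; the function name and conventions are ours:

```python
import numpy as np

def moment_match_mixture(weights, means, variances):
    """Collapse a complex Gaussian mixture sum_i p_i CN(m_i, v_i) into
    one Gaussian with the same mean and variance (moment matching)."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()                              # normalize mixture weights
    m = np.asarray(means, dtype=complex)
    v = np.asarray(variances, dtype=float)
    mean = np.sum(p * m)
    # Law of total variance: E[|z|^2] - |E[z]|^2.
    var = np.sum(p * (v + np.abs(m) ** 2)) - np.abs(mean) ** 2
    return mean, var
```

For instance, an equal-weight mixture of CN(-1, 0.1) and CN(+1, 0.1) collapses to a zero-mean Gaussian whose variance also absorbs the spread of the component means.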
IV-A Algorithm based on BP combined with Gaussian approximation
Since the intractability of the messages is due to the Gaussian mixture in (11), we approximate those messages as proposed in [12], i.e., for each we set
(14) 
with
(15) 
In (15), we have defined the normalized amplitudes of the Gaussian mixture , where the constant ensures . We also denote the mean and variance of the pdf in (12) by and , , and we define the vector and the matrix with entries if and zero otherwise, for all .

Now, using (9) and (14), the message in (13) becomes
(16) 
with
(17) 
These messages are further passed as extrinsic values, i.e., . For each , the following message is then computed:
After passing the extrinsic messages , , , we apply the BP update rule to compute the probabilities of the coded and interleaved bits (which is equivalent to MAP demapping), followed by BP decoding to obtain the beliefs of the information bits.
IV-B Algorithm based on expectation propagation
We set and . The message computed with (4) is proportional to a Gaussian pdf; consequently, the EP rule for reduces to the BP rule and outputs a Gaussian pdf as in (IV-A), since the operator is an identity operator for Gaussian arguments.
Specifically, using (3), (4), and then (11), (IV-A), we have
for each , where
with
and , as in (17). Using (4) again, we obtain
with
(18) 
Unlike (15) in BP with Gaussian approximation, the values of and , , computed with (18) depend on all and , , , through (17). The parameters of are updated using (17), but with and computed as above. Note that all messages that depend on the channel coefficients need to be updated sequentially. The rest of the messages are computed as in Section IV-A.
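The per-symbol EP step described above can be sketched as follows: multiply the Gaussian "cavity" (the product of all other messages) by a Gaussian-mixture symbol message, project the result onto a Gaussian by moment matching, then divide out the cavity in natural parameters. The cavity/site terminology and all names are ours, and a practical implementation must safeguard against the site precision turning negative, a known source of EP instability.

```python
import numpy as np

def ep_site_update(cav_mean, cav_var, weights, sym_means, sym_vars):
    """One scalar EP site update against a complex Gaussian mixture."""
    sym_means = np.asarray(sym_means, dtype=complex)
    sym_vars = np.asarray(sym_vars, dtype=float)
    # Product of the Gaussian cavity with each mixture component is Gaussian.
    post_vars = 1.0 / (1.0 / cav_var + 1.0 / sym_vars)
    post_means = post_vars * (cav_mean / cav_var + sym_means / sym_vars)
    # Responsibilities: prior weight times evidence CN(cav_mean; m_i, cav_var + v_i).
    tot = cav_var + sym_vars
    ev = np.exp(-np.abs(cav_mean - sym_means) ** 2 / tot) / (np.pi * tot)
    p = np.asarray(weights, dtype=float) * ev
    p = p / p.sum()
    # Project the posterior mixture onto a Gaussian (moment matching).
    post_mean = np.sum(p * post_means)
    post_var = np.sum(p * (post_vars + np.abs(post_means) ** 2)) - np.abs(post_mean) ** 2
    # Divide out the cavity in natural parameters to obtain the new site.
    # (In practice site_prec can become negative and needs safeguarding.)
    site_prec = 1.0 / post_var - 1.0 / cav_var
    site_mp = post_mean / post_var - cav_mean / cav_var
    return site_mp / site_prec, 1.0 / site_prec
```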
IV-C Algorithm based on the combined BP-MF framework
The factor graph is split into the MF and BP parts by setting and . Such a splitting yields tractable and simple messages, takes advantage of the fact that BP works well with hard constraints, and best exploits the correlation between the channel coefficients for the graphical representation in Fig. 1. (Alternatively, the same level of exploitation of the correlation is obtained by representing the channel variables as a single vector variable and “moving” factor node to the MF part [11].)
Assuming we have obtained the messages (their expression will be given later), we can compute
where
with the definition and .
The messages are sent to the BP part and hence are extrinsic values. When computing , we get the same expression as (IV-A), with the parameters (17). Unlike in the previous algorithms, the following messages are beliefs, i.e., a posteriori probabilities (APPs):
with
(19)  
Then, for all , we compute
(20) 
and we pass to the modulation and coding part of the graph as extrinsic values, for all . After running BP, we obtain (10) and then pass the following APP values back to the MF part:
IV-D Algorithm based on BP-EM
We now apply EM for channel estimation, so we constrain from the previous BP-MF scheme to be Dirac delta functions. The resulting messages are the same as in the previous subsection, except for with computed as in (19). Note that this algorithm uses only point estimates of the channel weights; however, its complexity is essentially the same, since the computation of (19) includes the computation of the corresponding variance anyway.
IV-E Scheduling of message computations
All algorithms employ the same message-passing scheduling: they start by sending messages corresponding to the pilots and by initializing ; messages (computed according to the corresponding algorithm) are then passed on up to the information bit variables – this completes the first iteration; each following iteration consists of passing messages up to the channel prior factor node and back; messages are passed back and forth until a predefined number of iterations is reached. All algorithms end by taking hard decisions on the beliefs of the information bits.
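The shared schedule can be sketched as a generic driver loop; the callables are placeholders for the algorithm-specific message updates and are entirely our own naming, not the paper's:

```python
def run_receiver(max_iter, init_pilot_messages, channel_update,
                 detection_decoding_pass, bit_beliefs):
    """Generic schedule shared by all four receivers (structure only).
    First iteration: pilot-based initialization, then one pass of
    messages up to the information bit variables. Each later iteration:
    messages up to the channel prior factor node and back, then another
    detection/decoding pass."""
    init_pilot_messages()
    detection_decoding_pass()          # completes the first iteration
    for _ in range(max_iter - 1):
        channel_update()               # up to the channel prior node and back
        detection_decoding_pass()
    return bit_beliefs()               # hard decisions are taken on these
```

For example, instrumenting the callables with counters confirms that `max_iter` detection/decoding passes and `max_iter - 1` channel updates are performed.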
V Simulation Results
We consider a wireless OFDM system with the parameters given in Table I, and we evaluate by means of Monte Carlo simulations the bit error rate (BER) performance of the receiver algorithms derived in Section IV. We employ as a reference a scheme which has perfect channel state information (CSI), i.e., it has prior knowledge of the vector of channel coefficients .
We encountered numerical problems with the EP-based scheme, due to the instability of EP in general, so we used the heuristic approach of [9] to damp the updates of the beliefs with a step size . The EP-based scheme also has higher computational complexity than the others, due to its message definition – it requires the multiplication of a Gaussian pdf with a mixture of Gaussian pdfs, and the approximation and division of Gaussian pdfs – and to the sequentiality of the message updates for the channel coefficients. (For the other receiver schemes, it can be shown that the parameters of all messages with can be computed jointly and with a lower complexity.)
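For a scalar Gaussian belief, the damping heuristic amounts to replacing the new belief by old^(1-ε) · new^ε, which is a convex combination in natural (precision and precision-times-mean) parameters. A generic sketch, with names of our own choosing:

```python
def damped_gaussian_update(old_mean, old_var, new_mean, new_var, step):
    """Damp a Gaussian belief update with step size `step` in (0, 1]:
    geometric interpolation old^(1-step) * new^step, i.e., a convex
    combination of the natural parameters of the two Gaussians."""
    prec = (1 - step) / old_var + step / new_var          # damped precision
    mp = (1 - step) * old_mean / old_var + step * new_mean / new_var
    return mp / prec, 1.0 / prec
```

With `step = 1` the update is undamped; smaller values pull the new belief toward the old one and stabilize oscillating fixed-point iterations at the cost of slower convergence.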
Results in terms of BER versus signal-to-noise ratio (SNR) are given in Fig. 2, while the convergence of the BER with the number of iterations is illustrated in Fig. 3. The receivers based on EP, combined BP-MF and BP-EM exhibit similar performance. They significantly outperform the receiver employing BP with Gaussian approximation. Note that even with a high pilot spacing, the performance of the former algorithms is close to that of the receiver having perfect CSI. These three algorithms converge in about 10–12 iterations, while BP with Gaussian approximation converges a little faster, but to a higher BER value. Other results, not presented here, show that for a higher pilot density the algorithms converge faster, as expected.
Note that the results for the (essentially equally complex) BP-EM and BP-MF receivers are nearly identical, even if the former discards the soft information in channel estimation. We noticed during our evaluations that even at low SNR values, so our explanation is that accounting for in the BP-MF receiver does not have a noticeable impact on the detection (20).
Parameter  Value 

Subcarrier spacing  
Number of active subcarriers  
Number of evenly spaced pilot symbols  
Pilot spacing  
Modulation scheme for data symbols  
Convolutional channel code  
Multipath channel model  3GPP ETU 
Coherence bandwidth of the channel 
VI Conclusions
We formulated the problem of joint channel estimation and decoding in a communication system as inference in a graphical model. To solve the inference problem, we resorted to a recently proposed messagepassing framework that unifies the BP and MF algorithms and includes them as particular instances. Additionally, we illustrated how the combined framework can encompass the EP and EM inference algorithms.
Based on the inference techniques considered, we derived four receiver algorithms. Since BP is not suitable for the studied problem, as it leads to intractable messages, we instead applied its variant employing Gaussian approximation of the computationally cumbersome messages. However, our results showed that it performs significantly worse than the other proposed schemes. Considering the BER results, the computational complexity and the stability of these schemes, we conclude that the receiver based on the combined BP-MF framework and its BP-EM variant are the most effective receiver algorithms.
Acknowledgment
Six projects have supported this work: the Project SIDOC under contract no. POSDRU/88/1.5/S/60078; the Cooperative Research Project 4GMCT funded by Intel Mobile Communications, Agilent Technologies, Aalborg University and the Danish National Advanced Technology Foundation; the PhD Project “Iterative Information Processing for Wireless Receivers” funded by Renesas Mobile Corporation; the Project ICT248894 WHERE2; the WWTF Grant ICT10066; and the FWF Grant S10603N13 within the National Research Network SISE.
References
 [1] M. Tüchler and A. C. Singer, “Turbo equalization: An overview,” IEEE Transactions on Information Theory, vol. 57, no. 2, pp. 920–952, 2011.

[2] J. Boutros and G. Caire, “Iterative multiuser joint decoding: unified framework and asymptotic analysis,” IEEE Transactions on Information Theory, vol. 48, no. 7, pp. 1772–1793, Jul. 2002.
[3] M. J. Wainwright and M. I. Jordan, “Graphical models, exponential families, and variational inference,” Foundations and Trends in Machine Learning, vol. 1, pp. 1–305, 2008.
[4] F. Kschischang, B. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
[5] H.-A. Loeliger, J. Dauwels, J. Hu, S. Korl, L. Ping, and F. Kschischang, “The factor graph approach to model-based signal processing,” Proc. IEEE, vol. 95, no. 6, pp. 1295–1322, Jun. 2007.
 [6] J. Winn and C. Bishop, “Variational message passing,” Journal of Machine Learning Research, vol. 6, pp. 661–694, 2005.
 [7] A. Dempster, N. Laird, and D. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 39, no. 1, pp. 1–38, 1977.
[8] T. Minka, “Expectation propagation for approximate Bayesian inference,” in Proc. 17th Conf. on Uncertainty in AI, 2001, pp. 362–369.
 [9] ——, “Divergence measures and message passing,” Microsoft Research, Tech. Rep., 2005.
[10] J. Yedidia, W. Freeman, and Y. Weiss, “Constructing free-energy approximations and generalized belief propagation algorithms,” IEEE Trans. Inform. Theory, vol. 51, no. 7, pp. 2282–2312, Jul. 2005.
[11] E. Riegler, G. E. Kirkelund, C. N. Manchón, M.-A. Badiu, and B. H. Fleury, “Merging belief propagation and the mean field approximation: A free energy approach,” accepted for publication in IEEE Trans. Inform. Theory, 2012, arXiv:1112.0467v2 [cs.IT].
 [12] Z. Shi, T. Wo, P. Hoeher, and G. Auer, “Graphbased soft iterative receiver for higherorder modulation,” in Proc. 12th IEEE Int. Conf. on Comm. Tech. (ICCT), Nanjing, China, Nov. 2010, pp. 825–828.