## I Introduction

Generalized approximate message passing (GAMP), proposed by Rangan [1, 2], is a generalization of approximate message passing (AMP), independently described by Donoho et al. [3]. GAMP allows general measurement channels (including non-linear channels) to be used. Owing to its Bayes optimality, its low computational complexity and, more importantly, the asymptotic accuracy of its state evolution (SE), GAMP has attracted growing attention in domains such as compressive sensing, image processing, Bayesian learning, statistical physics, low-rank matrix estimation, mmWave channel estimation, spatial modulation, user activity and signal detection in random access, orthogonal frequency division multiplexing systems with analog-to-digital converters, and sparse superposition codes [4, 5, 6, 7, 8, 9, 10, 11].

The original AMP and GAMP are derived via belief propagation (BP) based on the central limit theorem (CLT) and Taylor series expansion. Expectation propagation (EP) [12, 13] is an alternative message passing rule that deals with general non-Gaussian probability distribution functions (PDFs). EP projects the *a-posteriori* estimation onto a Gaussian distribution by moment matching, and thus obtains a message update rule similar to Gaussian message passing (GMP) [14, 15, 16, 17, 18]. The potential connection between AMP and EP was first shown in [19, 20], in which the fixed points of EP and AMP were shown to be consistent. An EP-based AMP was proposed in [21]. Recently, Ma and Ping proposed an orthogonal AMP (OAMP) for general unitarily-invariant measurement matrices, and showed that the optimal MMSE OAMP is equivalent to MMSE EP [22, 23, 24]. These works hint at the conceptual equivalence between EP and AMP. In [25], Meng et al. first gave a rigorous derivation of AMP based on a dense graph-based EP by making some approximations in the large system limit. Based on the results in [25], the authors further provided a unified Bayesian inference framework for the extension of AMP and vector AMP (VAMP) to the generalized linear model [26, 27]. Another form of EP-based derivation for MMSE GAMP was illustrated in [28]. More recently, the connection between EP and the max-sum GAMP was established in [29].

In [1, 2, 3], the authors used Taylor expansion and second-order approximation for the non-linear constraints of the general measurement channel. In this paper, we adopt a different approach, in which the general non-linear constraints are handled by an easily understandable EP rule that has the same form as the GMP rule (for the linear constraints). The only difference between EP and GMP is that the *a-posteriori* calculation is replaced by a non-linear MMSE estimation, which makes EP more efficient than GMP in solving non-linear problems. As a result, the whole general measurement problem is solved by a unified “GMP-like” rule. By neglecting the high-order infinitesimal terms, the EP-based message passing algorithm (EP-MPA) is proven to be equivalent to GAMP. Furthermore, for additive white Gaussian noise (AWGN) measurement channels, the EP-MPA is proven to be equivalent to AMP. These results offer a new insight into GAMP and AMP, and may provide hints for building new MPAs for more general non-linear networks.

## II Problem Formulation

GAMP considers the system given in Fig. 1, where , , and are related by a linear function , and and are subject to the symbol-wise transfer probability functions and , respectively. In addition, has i.i.d. Gaussian components . The goal of GAMP is to iteratively recover and given and , which is equivalent to estimating the marginal probabilities below

(1a) | |||

(1b) |

where is a Dirac delta function. However, exact calculation of (1) has intractable complexity for large scale problems.

For more general with finite , we can rewrite the system as , where and . All the results in this paper then remain valid after replacing with . For example, if , we replace by .

## III EP-Based Message Passing Algorithm

Fig. 2 gives a Forney-style factor graph of the system in (1), where edges denote variables, and nodes denote the related constraints: , , and . The message passing algorithm (MPA) [14] is a method to iteratively compute the marginal probability. Since the high-dimensional integration is computed in a distributed manner by local message passing, it has low complexity. Next, we briefly introduce EP [12, 13].

### III-A Expectation Propagation

###### Definition 1

Let the *a-priori* message be with , and a constraint of . EP updates

(2a) | ||||

(2b) |

where and .

By letting , , , , and , it is easy to verify that (2) is consistent with that in [12] (see Eqs. 3.32-3.34). The form in (2) has also been widely used for EP [30, 24].
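As a minimal sketch of the moment-matching update in Definition 1, the snippet below assumes the commonly used form in which the extrinsic Gaussian is obtained by dividing the Gaussian projection of the posterior by the Gaussian prior. The function names `ep_extrinsic` and `gaussian_posterior`, and all variable names and numbers, are illustrative assumptions rather than notation from the paper. As a sanity check it uses a Gaussian constraint, for which the extrinsic message should reproduce the constraint itself.

```python
def ep_extrinsic(x_pri, v_pri, x_post, v_post):
    """Assumed EP rule: remove the Gaussian prior from the Gaussian-projected posterior."""
    v_ext = 1.0 / (1.0 / v_post - 1.0 / v_pri)           # precisions subtract
    x_ext = v_ext * (x_post / v_post - x_pri / v_pri)    # precision-weighted means subtract
    return x_ext, v_ext


def gaussian_posterior(x_pri, v_pri, m, s):
    """Posterior moments when the constraint itself is Gaussian with mean m and variance s."""
    v_post = 1.0 / (1.0 / v_pri + 1.0 / s)
    x_post = v_post * (x_pri / v_pri + m / s)
    return x_post, v_post


# Hypothetical numbers: prior message N(0.3, 2.0), Gaussian constraint N(1.5, 0.5).
x_pri, v_pri = 0.3, 2.0
m, s = 1.5, 0.5
x_post, v_post = gaussian_posterior(x_pri, v_pri, m, s)
x_ext, v_ext = ep_extrinsic(x_pri, v_pri, x_post, v_post)
print(x_ext, v_ext)
```

For a Gaussian constraint the extrinsic pair equals `(m, s)` up to rounding, which is the sense in which GMP is a special case of this rule.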

*Relation to Standard GMP:* In fact, when the constraint is linear and Gaussian, EP in (2) is exactly GMP (footnote 1: for example, is a Gaussian constraint of , and (given , and the distribution of ) is a linear constraint of ). For instance, if is a Gaussian constraint , the *a posteriori probability* is Gaussian and given by

(3a) | ||||

(3b) | ||||

(3c) |

where

(4a) | ||||

(4b) |

which can be rewritten to

(5a) | ||||

(5b) |

GMP [14, 15, 16, 17] follows the well-known extrinsic message passing (EMP) rule, also known as the Turbo principle, in which the output does not involve the input , i.e.,

(6a) | ||||

(6b) |

From (4), (6) is the same as (2). Hence, GMP is an instance of EP. In Turbo, there is a famous “information equation”:

(7) |

That is, the information contained in the *a-posteriori* message equals the sum of the information contained in the *a-priori* message and in the extrinsic message. This principle has been widely used in modern channel coding and in the sum-product algorithm. For example, the extrinsic message can be calculated by removing the *a-priori* message from the *a-posteriori* message.

If is non-Gaussian, EP in (2) no longer coincides with GMP: (2) and (6) are not equivalent, and the “information equation” in (7) no longer holds. In general, EP could provide more useful information than EMP (or Turbo) for non-Gaussian , i.e., the following “information inequality” holds:

(8) |

which implies that “EP” outperforms “Turbo”. For more details, refer to [30, 31].

*Intuition of EP:* In general, the *a posteriori probability* (APP) estimation is the optimal local estimation since it fully exploits the *a-priori* (or input) message, but it introduces correlation between messages in the iterative process. To avoid this correlation problem, the Turbo principle discards the *a-priori* message in the estimation, which results in a performance loss since the *a-priori* message is not exploited. EP strikes a tradeoff between APP and Turbo: the *a-priori* message is partly used to improve the estimation, while the correlation problem is still avoided. For these reasons, EP could perform better than EMP.
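To make the non-Gaussian case concrete, the sketch below computes the posterior moments by brute-force numerical integration on a grid and then applies the assumed moment-matching rule (posterior precision minus prior precision). The Laplace constraint, the grid settings, and the name `ep_update_numeric` are assumptions chosen purely for illustration; they do not come from the paper.

```python
import numpy as np


def ep_update_numeric(x_pri, v_pri, constraint, half_width=12.0, n=200001):
    """EP update for a scalar variable with an arbitrary (unnormalized) constraint density.

    Posterior moments are obtained by summing over a uniform grid centred on the
    prior mean; the extrinsic Gaussian follows from moment matching.
    """
    grid = x_pri + np.sqrt(v_pri) * np.linspace(-half_width, half_width, n)
    w = np.exp(-0.5 * (grid - x_pri) ** 2 / v_pri) * constraint(grid)
    w /= w.sum()                                   # normalized posterior weights
    x_post = np.sum(w * grid)                      # non-linear MMSE estimate
    v_post = np.sum(w * (grid - x_post) ** 2)      # posterior variance
    v_ext = 1.0 / (1.0 / v_post - 1.0 / v_pri)
    x_ext = v_ext * (x_post / v_post - x_pri / v_pri)
    return (x_post, v_post), (x_ext, v_ext)


def laplace(x):
    """Hypothetical non-Gaussian constraint: an unnormalized Laplace density."""
    return np.exp(-np.abs(x))


post, ext = ep_update_numeric(0.5, 2.0, laplace)
print("posterior (mean, var):", post)
print("extrinsic (mean, var):", ext)
```

Note that the extrinsic variance is only meaningful when the posterior variance is smaller than the prior variance; for constraints where this fails, practical EP implementations typically need some safeguard, which this sketch omits.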

Fig. 3 illustrates the message passing for the problem, where are the messages (mean and variance for ) passing from VN to XCN, for from XCN to VN, for from VN to SN, for from SN to VN, for from SN to ZCN, and for from ZCN to SN. Next, we derive the message passing algorithm based on the expectation propagation principle under a unified “GMP-like” rule.

### III-B A Unified “GMP-like” EP-MPA

Fig. 3 illustrates the EP-MPA, where ZCN and XCN denote the constraint nodes of and respectively. The message updates at variable node (VN) and sum node (SN) are GMP, while ZCN and XCN are EP.

*Step I (SN → ZCN):*
Since and , from the central limit theorem (CLT), we have , where

(9a) |

with initialization , .
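The CLT step can be sketched as follows, assuming edge-indexed Gaussian messages from the variable nodes to the sum nodes stored as `M x N` arrays. Under that assumption, each linear combination is approximated by a Gaussian whose mean is the weighted sum of the incoming means and whose variance is the sum of the incoming variances weighted by the squared matrix entries. The names `sn_to_zcn`, `x_msg`, and `v_msg` are hypothetical.

```python
import numpy as np


def sn_to_zcn(A, x_msg, v_msg):
    """Gaussian (CLT) approximation of each row sum of A times the incoming messages.

    A     : (M, N) measurement matrix
    x_msg : (M, N) means of the messages on edge (a, i)
    v_msg : (M, N) variances of the messages on edge (a, i)
    """
    z_mean = np.sum(A * x_msg, axis=1)         # mean of the sum of independent terms
    z_var = np.sum(A ** 2 * v_msg, axis=1)     # variances add, scaled by squared weights
    return z_mean, z_var


# Tiny illustrative example with hand-checkable numbers.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
x_msg = np.ones((2, 2))
v_msg = 0.5 * np.ones((2, 2))
z_mean, z_var = sn_to_zcn(A, x_msg, v_msg)
print(z_mean, z_var)
```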

*Step II (ZCN → SN):*
The message update at the ZCN for uses EP with constraints and :

(10a) | ||||

(10b) |

where .

*Step III (SN → VN):*
The constraints at the -th SN are , and . The message updates at the SN for the VN are:

(11a) | ||||

(11b) |

where and .

*Step IV (VN → XCN):* The constraints at the -th VN are . The message updates at the VN are:

(12a) | ||||

(12b) |

where .

*Algorithm 1 (EP-MPA), outline:* *Step I*: for each , compute. *Step II*: for each . *Steps III and IV*: for each and . *Steps V and VI*: for each and .

*Step V (XCN → VN):* The message update at the XCN for the VN for uses EP with constraints and , i.e., for each ,

(13a) | ||||

(13b) |

where .

*Step VI (VN → SN):* The constraints at the -th VN are and . The message updates at the VN for the SN are:

(14a) | ||||

(14b) |

where and .

We eliminate the auxiliary variables , and obtain

(15a) | ||||

(15b) |

Therefore, we obtain a unified “GMP-like” EP-MPA, and the above steps are summarized in Algorithm 1.

## IV Equivalence between EP and GAMP/AMP

The equivalence of EP and AMP was first derived in [25], based on which [26, 27] further proposed a unified Bayesian inference framework for the extension of AMP and VAMP to the generalized linear model. Another form of EP-based derivation for MMSE GAMP was illustrated in [28]. In [29], the max-sum GAMP was derived via EP. In this section, we derive the MMSE GAMP and MMSE AMP by applying some approximations to the unified “GMP-like” EP-MPA in Algorithm 1.

### IV-A Connection with GAMP

For simplicity, we define

(16a) | ||||

(16b) |

###### Proposition 1

For , we have

(17a) |

###### Proof:

First, we have since the *a-priori* message does not increase the conditional variance. In addition, from the symmetry of the system, . Therefore, we have .

###### Proposition 2

Message update in (12) can be rewritten as

(18a) | ||||

(18b) | ||||

where | ||||

(18c) | ||||

(18d) |

###### Proof:

See Appendix A.

###### Proposition 3

Message update (12) can be rewritten as

(19a) | |||

###### Proof:

See Appendix B.

According to Propositions 1-3, the auxiliary variables and can be eliminated, and the EP-MPA in Algorithm 1 can be rewritten as the MMSE GAMP in Algorithm 2. Therefore, we have the following lemma.

###### Lemma 1

EP-MPA is equivalent to MMSE GAMP.

For balance systems, we have and . Therefore, the MMSE GAMP can be further simplified to

(20a) | |||

(20b) |

where and .

### IV-B Connection with AMP

In AMP, from , we have

Thus,

(22) |

###### Lemma 2

EP-MPA can be rewritten to AMP.

###### Proof:

See Appendix C.
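For orientation, the following is a minimal sketch of the textbook AMP recursion for an AWGN measurement channel with an MMSE denoiser, as it commonly appears in the AMP literature. It is not the paper's algorithm listing. The Bernoulli-Gaussian prior parameters `rho` and `sx2`, the matrix scaling (i.i.d. Gaussian entries with variance `1/M`), the empirical estimate of the effective noise variance, the problem sizes, and all function names are assumptions made for this illustration.

```python
import numpy as np


def bg_denoiser(u, tau, rho, sx2):
    """Posterior mean/variance of x from u = x + N(0, tau), x ~ (1-rho)*delta_0 + rho*N(0, sx2)."""
    # Likelihood ratio of the "zero" component against the "non-zero" component.
    ratio = ((1 - rho) / rho) * np.sqrt((sx2 + tau) / tau) \
        * np.exp(-0.5 * u ** 2 * (1.0 / tau - 1.0 / (sx2 + tau)))
    pi = 1.0 / (1.0 + ratio)                 # posterior probability that x is non-zero
    m = u * sx2 / (sx2 + tau)                # mean given non-zero
    s = sx2 * tau / (sx2 + tau)              # variance given non-zero
    mean = pi * m
    var = pi * (s + m ** 2) - mean ** 2
    return mean, var


def amp(y, A, rho, sx2, n_iter=30):
    """Standard AMP with an MMSE denoiser and Onsager correction."""
    M, N = A.shape
    x = np.zeros(N)
    r = y.copy()
    for _ in range(n_iter):
        tau = max(np.mean(r ** 2), 1e-12)    # empirical effective noise variance
        u = x + A.T @ r                      # pseudo-observation of x
        x, v = bg_denoiser(u, tau, rho, sx2)
        # For an MMSE denoiser, the derivative w.r.t. u equals posterior variance / tau.
        onsager = (N / M) * np.mean(v) / tau * r
        r = y - A @ x + onsager
    return x


# Small synthetic experiment (all sizes and parameters are illustrative).
rng = np.random.default_rng(0)
N, M, rho, sx2 = 400, 200, 0.1, 1.0
x_true = rng.normal(0.0, np.sqrt(sx2), N) * (rng.random(N) < rho)
A = rng.normal(0.0, 1.0, (M, N)) / np.sqrt(M)
y = A @ x_true + 0.01 * rng.normal(size=M)
x_hat = amp(y, A, rho, sx2)
nmse = np.sum((x_hat - x_true) ** 2) / np.sum(x_true ** 2)
print("NMSE:", nmse)
```

The Onsager term is what distinguishes this recursion from plain iterative thresholding; it plays the role of removing the correlation that the extrinsic (EP-style) update removes in the message passing formulation.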

## V Numerical Results

We study a clipped compressed sensing problem where follows a symbol-wise Bernoulli-Gaussian distribution, i.e. ,