In this work, we consider the generalized bilinear recovery problem: jointly estimate the vectorand the matrix from componentwise and probabilistic measurements , where , is a known affine linear function of (i.e., with known matrices .). This problem arises in a wide range of applications in the field of signal processing and computer science. For example, compressed sensing under matrix uncertainty [1, 2, 3, 4], matrix completion [5, 6, 7], robust principle component analysis (RPCA) , dictionary learning [9, 10], joint channel and data decoding [11, 12, 13] can all be formulated as generalized bilinear recovery problem. Generally the scalar conditional distribution models arbitrary componentwise measurement process in a probabilistic manner. Specially, corresponds to the scenario of linear measurements, i.e., , where and . In practice, however, the measurements are often obtained in a nonlinear way. For example, quantization is a common nonlinear measurement process in analog-to-digital converter (ADC) that maps the input signal from continuous space to discrete space, which has been widely used in (one-bit) compressed sensing , millimeter massive multiple input multiple output (MIMO) system,, etc. As a result, it is of high significance to study the generalized bilinear recovery problem.
There has been extensive research on this active field in the past few years, including the convex relaxation methods [17, 18], variational methods , approximate message passing (AMP) methods such as bilinear generalized AMP (BiGAMP) [9, 10] and parametric BiGAMP (PBiGAMP) 
, etc. It was shown that the AMP based methods are competitive in terms of phase transition and computation time[9, 10, 20, 21]. However, as the measurement matrix deviates from the i.i.d. Gaussian, the AMP may diverge [22, 23]. To improve convergence of AMP, vector approximate message passing (VAMP) and orthogonal AMP (OAMP)  have been recently proposed, which achieve good convergence performance for any right-rotationally invariant measurement matrices and can be rigorously characterized by the scalar state evolution. For the generalized linear model, AMP is extended to generalized approximate message passing (GAMP) [26, 27]. Later, generalized VAMP  and generalized expectation consistent algorithm  are proposed to handle a class of right-rotationally invariant measurement matrices. In 
, a unified Bayesian inference framework is provided and some insights into the relationship between AMP (VAMP) and GAMP (GVAMP) are presented. Due to the improved convergence of VAMP over AMP on general measurement matrices, many works have been done to extend VAMP to deal with the bilinear recovery problem[31, 32]. In , lifted VAMP is proposed for standard bilinear inference problem such as compressed sensing with matrix uncertainty and self-calibration. However, lift VAMP suffers from high computational complexity since the number of unknowns increases significantly, especially when the number of original variables is large. To overcome the computation issue, the bilinear adaptive VAMP (BAd-VAMP) has been proposed very recently in  which avoids lifting and instead builds on the adaptive AMP framework [35, 34]. Nevertheless, BAd-VAMP is only applicable to linear measurements which limits its usage in the generalized bilinear recovery problem.
In this paper, we propose a new algorithm called the bilinear adaptive generalized vector AMP (BAd-GVAMP), which extends the BAd-VAMP  from linear measurements to nonlinear measurements. Specifically, a novel factor graph representation of the generalized bilinear problem is first proposed by incorporating the Dirac delta function. Then, by using the expectation propagation (EP) , we decouple the original generalized bilinear recovery problem into two modules: one module performs componentwise minimum mean square error (MMSE) estimate while the other performs BAd-VAMP with some slight modification of the message passing schedule. Furthermore, the messages exchanging between the two modules are derived to obtain the final BAd-GVAMP. Interestingly, BAd-GVAMP reduces to the BAd-VAMP under linear measurements. Numerical results are conducted for quantized compressed sensing with matrix uncertainty, self-calibration as well as structured dictionary learning from quantized measurements, which demonstrates the effectiveness of the proposed algorithm.
Ii Problem Setup
Consider the generalized bilinear recovery problem as follows: jointly estimate the matrix and the parameters from the componentwise probabilistic measurements , i.e.,
where denotes the nonlinear observations, is a known matrix-valued linear function parameterized by the unknown vector , is the prior distribution of parameterized by , is the componentwise probabilistic output distribution conditioned on and parameterized by . Given the above statistical model, the goal is to compute the maximum likelihood (ML) estimate of and the MMSE estimate of , i.e.,
where is the likelihood function of
and the expectation is taken with respect to the posterior probability density distribution
However, exact ML estimate of and exact MMSE estimate of is intractable due to high-dimensional integration. As a result, approximate methods need to be designed in practice.
Iii Biliear Adaptive Generalized VAMP
In this section, we propose an efficient algorithm to approximate the ML estimate of and MMSE estimate of . The resultant BAd-GVAMP algorithm is an extension of BAd-VAMP from linear measurements to nonlinear measurements. To begin with, we first present a novel factor graph representation of the statistical model. By introducing a hidden variable and a Dirac delta function
, the joint distribution in (5) can be equivalently factored as
The corresponding factor graph of (6) is shown in Fig. 1 (a). The circles and squares denote the variable and factor node, respectively. Such alternative factor graph representation plays a key role in the design of our approximate estimation algorithm. Now we will derive the BAd-GVAMP algorithm based on the presented factor graph in Fig. 1 (a) and the EP . As one kind of approximate inference methods, EP approximates the target distribution p with an exponential family distribution (usually Gaussian) set which minimizes the Kullback-Leibler (KL) divergence , i.e., . For Gaussian distribution set
, EP amounts to moment matching, i.e., the first and second moments of distributionmatches those of the target distribution. For more details of EP and its relation to AMP methods, please refer to [36, 24, 38, 39, 40, 41].
To address the generalized bilinear recovery problem, specifically, we choose the projection set to be Gaussian with scalar covariance matrix, i.e., diagonal matrix whose diagonal elements are equal 111Note that general diagonal matrix can also be used.. Then, using EP on the factor graph in Fig. 1, we decouple the original generalized bilinear recovery problem into two modules: the componentwise MMSE module and the BAd-VAMP module. The two modules interact with each other iteratively with extrinsic messages exchanging between them. The detailed derivation of BAd-GVAMP is presented as follows.
Iii-a Componentwise MMSE module
Suppose that in the -th iteration, the message from factor node to variable node follows Gaussian distribution, i.e.,
where refers to the factor node . According to EP, the message from variable node to the factor node can be calculated as
where denotes identity up to a normalizing constant. First, we perform componentwise MMSE and obtain the posterior means and variances of as
where and are the mean and variance operations taken (componentwise) with respect to the distribution (9). Then the posterior variances are averaged over which yields
so that is approximated as
As a result, the message from the variable node to the factor node can be calculated (componentwise) as
where the extrinsic means and variances are
To learn the unknown parameter , EM can be adopted , i.e.,
where is given by (13).
Iii-B BAd-VAMP module
As shown in (14), the message from the variable node to the factor node follows Gaussian distribution . Referring to the definition of the for the factor node, we obtain a pseudo linear observation equation as
where , and . The factor graph corresponding to (18) is shown Fig. 2, where the dash square is used to indicate pseudo observations. As a result, the BAd-VAMP algorithm  for the standard bilinear recovery problem can be applied. For completeness and ease of reference, we present the derivation of BAd-VAMP in  based on the factor graph shown in Fig. 2 (b), in which replicas of are introduced, i.e., . In the following, let and denote the th column of and , respectively. Assume that the message transmitted from the factor node to the variable node is
where refers to the factor node . Note that can be viewed as the prior of . Combining the pseudo observation equation (18) with , the linear MMSE (LMMSE) estimate of is performed and the posterior distribution of is obtained as
where the posterior mean and covariance matrix are
In addition, the EM algorithm is incorporated to learn and update the pseudo noise precision , i.e.,
Specifically, for the affine-linear model , the detailed expression of estimating and are given by 
and is given by (22).
The message from the variable node to the delta node is calculated as
where is defined in (20). Projecting the posterior distribution to the Gaussian distribution with scalar covariance matrix yields
According to the definition of the factor node , the message satisfies
Combining the prior with , the posterior mean and variances of are calculated as
To learn the unknown parameters and , EM algorithm is applied in the inner iterations , i.e.,
Now the message from the variable node to the factor node is calculated as
where is (35), and are given by
According to the definition of the factor node , the message from the factor node to the variable node is , which closes the BAd-VAMP algorithm.
Iii-C Messages from BAd-VAMP module to MMSE module
After performing BAd-VAMP for one or more iterations, we now focus on how to calculate the extrinsic message from the BAd-VAMP module to the component-wise MMSE module. Referring to the original factor graph shown in Fig. 1 (a), according to EP, the extrinsic message can be calculated as
In BAd-VAMP, as shown in the above subsection B, we have already obtained the message from the factor node to the variable node . It can be seen from Fig. 2 that the message is the same as so that . After some algebra, the posterior distribution of can be calculated to be Gaussian, i.e., , with the covariance matrix and mean vector being
Then, the posterior distribution of is further projected to Gaussian distribution with scalar covariance matrix, yielding
Moreover, the posterior variances are averaged over the index , which leads to
by which is approximated as . As a result, the message in (42) becomes
which closes the loop of the whole algorithm.
To sum up, the BAd-GVAMP algorithm can be summarized as Algorithm 1.
Iii-D Relation of BAd-GVAMP to BAd-VAMP
The obtained BAd-GVAMP algorithm is an extension of BAd-VAMP from linear measurements to nonlinear measurements. Intuitively, as shown in Fig 1 (b), BAd-GVAMP iteratively reduces the original generalized bilinear recovery problem to a sequence of standard bilinear recovery problems. In each iteration of BAd-GVAMP, a pseudo linear measurement model is obtained and one iteration of BAd-VAMP is performed 222It is also possible to perform multiple iterations of the BAd-VAMP in a whole single iteration of BAd-GVAMP.. Note that the message passing schedule of the BAd-VAMP module within BAd-GVAMP is different from the original BAd-VAMP in : in  variable de-noising is performed first and then LMMSE, while in the BAd-VAMP module of the proposed BAd-GVAMP, LMMSE is performed first and then variable de-noising. It is worth noting that in the special case of linear measurements, i.e., when is Gaussian, i.e., , the BAd-GVAMP reduces to BAd-VAMP precisely since in such case the extrinsic means and variances from the MMSE module always satisfy