Bilinear Adaptive Generalized Vector Approximate Message Passing

10/18/2018
by Xiangming Meng et al.
Zhejiang University

This paper considers the generalized bilinear recovery problem, which aims to jointly recover the vector b and the matrix X from componentwise nonlinear measurements Y ∼ p(Y|Z) = ∏_{i,j} p(Y_ij|Z_ij), where Z = A(b)X, A(·) is a known affine linear function of b (i.e., A(b) = A_0 + ∑_{i=1}^Q b_i A_i with known matrices A_i), and p(Y_ij|Z_ij) is a scalar conditional distribution which models the general output transform. A wide range of real-world applications, e.g., quantized compressed sensing with matrix uncertainty, blind self-calibration and dictionary learning from nonlinear measurements, and one-bit matrix completion, can be cast as the generalized bilinear recovery problem. To address this problem, we propose a novel algorithm called Bilinear Adaptive Generalized Vector Approximate Message Passing (BAd-GVAMP), which extends the recently proposed Bilinear Adaptive Vector AMP (BAd-VAMP) algorithm to incorporate arbitrary distributions on the output transform. Numerical results on various applications demonstrate the effectiveness of the proposed BAd-GVAMP algorithm.


I Introduction

In this work, we consider the generalized bilinear recovery problem: jointly estimate the vector b and the matrix X from componentwise probabilistic measurements Y ∼ p(Y|Z) = ∏_{i,j} p(Y_ij|Z_ij), where Z = A(b)X and A(·) is a known affine linear function of b (i.e., A(b) = A_0 + ∑_{i=1}^Q b_i A_i with known matrices A_i). This problem arises in a wide range of applications in signal processing and computer science. For example, compressed sensing under matrix uncertainty [1, 2, 3, 4], matrix completion [5, 6, 7], robust principal component analysis (RPCA) [8], dictionary learning [9, 10], and joint channel and data decoding [11, 12, 13] can all be formulated as generalized bilinear recovery problems. In general, the scalar conditional distribution p(Y_ij|Z_ij) models an arbitrary componentwise measurement process in a probabilistic manner. In particular, p(Y_ij|Z_ij) = N(Y_ij; Z_ij, σ²) corresponds to the scenario of linear measurements Y = Z + W with additive white Gaussian noise W, where N(x; μ, σ²) denotes a Gaussian distribution with mean μ and variance σ². In practice, however, the measurements are often obtained in a nonlinear way. For example, quantization is a common nonlinear operation in analog-to-digital converters (ADCs) that maps the input signal from a continuous space to a discrete space; it plays a central role in (one-bit) compressed sensing [14] and in millimeter wave massive multiple-input multiple-output (MIMO) systems [15], [16]. As a result, it is of great significance to study the generalized bilinear recovery problem.

There has been extensive research in this active field over the past few years, including convex relaxation methods [17, 18], variational methods [19], and approximate message passing (AMP) methods such as bilinear generalized AMP (BiGAMP) [9, 10] and parametric BiGAMP (PBiGAMP) [20]. It has been shown that the AMP based methods are competitive in terms of phase transition and computation time [9, 10, 20, 21]. However, as the measurement matrix deviates from i.i.d. Gaussian, AMP may diverge [22, 23]. To improve the convergence of AMP, vector approximate message passing (VAMP) [24] and orthogonal AMP (OAMP) [25] have recently been proposed; they achieve good convergence for right-rotationally invariant measurement matrices and can be rigorously characterized by a scalar state evolution. For the generalized linear model, AMP has been extended to generalized approximate message passing (GAMP) [26, 27]. Later, the generalized VAMP [28] and the generalized expectation consistent algorithm [29] were proposed to handle a class of right-rotationally invariant measurement matrices. In [30], a unified Bayesian inference framework is provided and some insights into the relationship between AMP (VAMP) and GAMP (GVAMP) are presented. Due to the improved convergence of VAMP over AMP on general measurement matrices, much work has been done to extend VAMP to the bilinear recovery problem [31, 32]. In [31], lifted VAMP is proposed for standard bilinear inference problems such as compressed sensing with matrix uncertainty and self-calibration. However, lifted VAMP suffers from high computational complexity since the number of unknowns increases significantly, especially when the number of original variables is large. To overcome this computational issue, the bilinear adaptive VAMP (BAd-VAMP) algorithm was proposed very recently in [33]; it avoids lifting and instead builds on the adaptive VAMP framework [34, 35]. Nevertheless, BAd-VAMP is only applicable to linear measurements, which limits its use in the generalized bilinear recovery problem.

In this paper, we propose a new algorithm called bilinear adaptive generalized VAMP (BAd-GVAMP), which extends BAd-VAMP [33] from linear measurements to nonlinear measurements. Specifically, a novel factor graph representation of the generalized bilinear problem is first proposed by incorporating the Dirac delta function. Then, using expectation propagation (EP) [36], we decouple the original generalized bilinear recovery problem into two modules: one module performs the componentwise minimum mean square error (MMSE) estimation, while the other performs BAd-VAMP with a slight modification of the message passing schedule. Furthermore, the messages exchanged between the two modules are derived to obtain the final BAd-GVAMP algorithm. Interestingly, BAd-GVAMP reduces to BAd-VAMP under linear measurements. Numerical experiments are conducted for quantized compressed sensing with matrix uncertainty, self-calibration, and structured dictionary learning from quantized measurements, which demonstrate the effectiveness of the proposed algorithm.

I-A Notation

Let N(x; μ, Σ) denote a Gaussian distribution of the random variable x with mean μ and covariance matrix Σ. Let (·)ᵀ, ‖·‖_F, ‖·‖, p(·) and δ(·) denote the transpose operator, the Frobenius norm, the ℓ₂ norm, the probability density function (PDF) and the Dirac delta function, respectively. Let ⟨·⟩ denote the average of the elements of its argument.

II Problem Setup

Consider the generalized bilinear recovery problem as follows: jointly estimate the matrix X ∈ R^{N×L} and the parameters θ ≜ {b, θ_x, θ_out} from the componentwise probabilistic measurements Y ∈ R^{M×L}, i.e.,

(1a) Z = A(b) X,
(1b) X ∼ p(X; θ_x),
(1c) Y ∼ p(Y|Z; θ_out) = ∏_{i,j} p(Y_ij|Z_ij; θ_out),

where Y denotes the nonlinear observations, A(b) ∈ R^{M×N} is a known matrix-valued linear function parameterized by the unknown vector b ∈ R^Q, p(X; θ_x) is the prior distribution of X parameterized by θ_x, and p(Y|Z; θ_out) is the componentwise probabilistic output distribution conditioned on Z and parameterized by θ_out. Given the above statistical model, the goal is to compute the maximum likelihood (ML) estimate of θ and the MMSE estimate of X, i.e.,

(2) θ̂ = argmax_θ p(Y; θ),
(3) X̂ = E[X | Y; θ̂],

where p(Y; θ) = ∫ p(X, Y; θ) dX is the likelihood function of θ, and the expectation is taken with respect to the posterior probability density function

(4) p(X | Y; θ) = p(X, Y; θ) / p(Y; θ),

where the joint distribution p(X, Y; θ) is

(5) p(X, Y; θ) = p(X; θ_x) p(Y | A(b)X; θ_out).

However, the exact ML estimate of θ and the exact MMSE estimate of X are intractable due to the high-dimensional integration involved. As a result, approximate methods need to be designed in practice.
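To make the setup concrete, the following sketch generates a synthetic instance of model (1): a Bernoulli-Gaussian prior on X, an affine-linear A(b), and a one-bit output channel as the componentwise distribution p(Y_ij|Z_ij). The dimensions, sparsity level, and noise level σ_w are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper).
M, N, L, Q = 64, 32, 8, 4

# Known matrices A_0, ..., A_Q defining the affine-linear map A(b).
A_mats = [rng.standard_normal((M, N)) / np.sqrt(M) for _ in range(Q + 1)]

def A_of_b(b):
    """Affine-linear operator A(b) = A_0 + sum_{i=1}^Q b_i A_i, cf. (1a)."""
    return A_mats[0] + sum(b[i] * A_mats[i + 1] for i in range(Q))

# Ground truth: calibration vector b and a Bernoulli-Gaussian sparse matrix X.
b_true = rng.standard_normal(Q)
X_true = rng.standard_normal((N, L)) * (rng.random((N, L)) < 0.1)

# Z = A(b) X, then a componentwise nonlinear output channel (1c):
# here one-bit quantization Y = sign(Z + W) with additive Gaussian noise W.
sigma_w = 0.05
Z_true = A_of_b(b_true) @ X_true
Y = np.sign(Z_true + sigma_w * rng.standard_normal((M, L)))
```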

III Bilinear Adaptive Generalized VAMP

In this section, we propose an efficient algorithm to approximate the ML estimate of θ and the MMSE estimate of X. The resultant BAd-GVAMP algorithm is an extension of BAd-VAMP from linear measurements to nonlinear measurements. To begin with, we present a novel factor graph representation of the statistical model. By introducing the hidden variable Z and a Dirac delta function δ(Z − A(b)X), the joint distribution in (5) can be equivalently factored as

(6) p(X, Z, Y; θ) = p(X; θ_x) δ(Z − A(b)X) p(Y|Z; θ_out).

The corresponding factor graph of (6) is shown in Fig. 1 (a). The circles and squares denote variable and factor nodes, respectively. This alternative factor graph representation plays a key role in the design of our approximate estimation algorithm. We now derive the BAd-GVAMP algorithm based on the factor graph in Fig. 1 (a) and EP [36]. As an approximate inference method, EP approximates the target distribution p with a distribution q from an exponential family (usually Gaussian) set which minimizes the Kullback-Leibler (KL) divergence KL(p‖q), i.e., q = argmin_{q∈Φ} KL(p‖q). For the Gaussian distribution set, EP amounts to moment matching, i.e., the first and second moments of the approximating distribution q match those of the target distribution p. For more details of EP and its relation to AMP methods, please refer to [36, 24, 38, 39, 40, 41].
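As a minimal numerical illustration of this moment-matching view (our own sketch, not code from the paper), the following snippet projects a one-dimensional tilted distribution onto a Gaussian by matching its first two moments on a grid:

```python
import numpy as np

def project_to_gaussian(log_f, grid):
    """Project an unnormalized density exp(log_f) on a uniform `grid` onto a
    Gaussian by moment matching, which minimizes KL(p || q) over Gaussians q."""
    w = np.exp(log_f - log_f.max())               # stable unnormalized weights
    w /= w.sum()                                  # discrete normalization
    mean = (grid * w).sum()                       # first moment
    var = ((grid - mean) ** 2 * w).sum()          # second central moment
    return mean, var

# Example: a Gaussian N(z; 0.3, 1.0) tilted by the output factor 1{z > 0}.
grid = np.linspace(-8.0, 8.0, 4001)
log_f = -0.5 * (grid - 0.3) ** 2 - np.where(grid > 0, 0.0, np.inf)
m, v = project_to_gaussian(log_f, grid)
print(f"moment-matched Gaussian: mean={m:.4f}, var={v:.4f}")
```

In BAd-GVAMP, this projection is applied to the tilted distributions formed by Gaussian messages and the output likelihood p(Y_ij|Z_ij).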

To address the generalized bilinear recovery problem, specifically, we choose the projection set to be Gaussian with scalar covariance matrix, i.e., a diagonal matrix whose diagonal elements are equal. (A general diagonal matrix can also be used.) Then, using EP on the factor graph in Fig. 1, we decouple the original generalized bilinear recovery problem into two modules: the componentwise MMSE module and the BAd-VAMP module. The two modules interact with each other iteratively, exchanging extrinsic messages. The detailed derivation of BAd-GVAMP is presented as follows.

III-A Componentwise MMSE module


Fig. 1: The factor graph and inference modules of the BAd-GVAMP algorithm.

Suppose that in the t-th iteration, the message from the factor node δ(Z − A(b)X) to the variable node Z follows a Gaussian distribution, i.e.,

(7) m_{δ→Z}(Z) = ∏_{i,j} N(Z_ij; z_{A,ij}, v_A),

where the subscript δ→Z refers to the message from the factor node δ(Z − A(b)X) to the variable node Z. According to EP, the message from the variable node Z to the factor node δ(Z − A(b)X) can be calculated as

(8) m_{Z→δ}(Z) ∝ Proj[q(Z)] / m_{δ→Z}(Z),
(9) q(Z) ∝ m_{δ→Z}(Z) p(Y|Z; θ_out),

where ∝ denotes identity up to a normalizing constant. First, we perform componentwise MMSE estimation and obtain the posterior means and variances of Z as

(10) ẑ_ij = E[Z_ij],
(11) v_{z,ij} = Var[Z_ij],

where E[·] and Var[·] are the mean and variance operations taken (componentwise) with respect to the distribution (9). Then the posterior variances are averaged over the indices (i, j), which yields

(12) v_z = ⟨v_{z,ij}⟩,

so that q(Z) is approximated as

(13) q(Z) ≈ ∏_{i,j} N(Z_ij; ẑ_ij, v_z).

As a result, the message from the variable node Z to the factor node δ(Z − A(b)X) can be calculated (componentwise) as

(14a) m_{Z→δ}(Z_ij) = N(Z_ij; z̃_ij, ṽ),

where the extrinsic means and variances are

(15) z̃_ij = ṽ (ẑ_ij/v_z − z_{A,ij}/v_A),
(16) ṽ = (1/v_z − 1/v_A)⁻¹.

To learn the unknown parameter θ_out, EM can be adopted [37], i.e.,

(17) θ̂_out = argmax_{θ_out} E[ln p(Y|Z; θ_out)],

where the expectation is taken with respect to the approximate posterior given by (13).
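To make the componentwise MMSE module concrete, the sketch below implements the posterior moments (10)-(11) and the extrinsic computation (15)-(16) for a one-bit output channel Y = sign(Z + W) with W ∼ N(0, σ_w²). The probit posterior-moment formulas are standard, but the function names and the choice of output channel are our illustrative assumptions, not the paper's.

```python
import numpy as np
from scipy.stats import norm

def onebit_mmse(y, z_A, v_A, sigma_w):
    """Posterior mean/variance of Z_ij under p(y|z) = Phi(y * z / sigma_w),
    i.e., y = sign(z + w) with w ~ N(0, sigma_w^2), given the incoming
    Gaussian message N(z_A, v_A) from (7); standard probit formulas."""
    s = np.sqrt(sigma_w**2 + v_A)
    t = y * z_A / s
    ratio = norm.pdf(t) / np.maximum(norm.cdf(t), 1e-300)  # inverse Mills ratio
    z_post = z_A + y * (v_A / s) * ratio                   # cf. (10)
    v_post = v_A - (v_A**2 / s**2) * ratio * (t + ratio)   # cf. (11)
    return z_post, v_post

def extrinsic(m_post, v_post, m_pri, v_pri):
    """EP extrinsic computation, cf. (15)-(16): divide the Gaussian-projected
    posterior by the incoming Gaussian message N(m_pri, v_pri)."""
    v_ext = 1.0 / (1.0 / v_post - 1.0 / v_pri)
    m_ext = v_ext * (m_post / v_post - m_pri / v_pri)
    return m_ext, v_ext
```

Here v_post should be averaged over all (i, j) as in (12) before the extrinsic step; in practice the extrinsic variance can turn negative and is typically clipped to a positive value.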

III-B BAd-VAMP module


Fig. 2: Two equivalent factor graphs for the pseudo linear observation model (18). Note that Fig. 2 (a) is the proposed factor graph, which introduces the delta function in a novel way, while Fig. 2 (b) is the factor graph proposed in [24].

As shown in (14), the message from the variable node Z to the factor node δ(Z − A(b)X) follows a Gaussian distribution N(Z_ij; z̃_ij, ṽ). Referring to the definition of the delta function for the factor node, we obtain a pseudo linear observation equation as

(18) Z̃ = A(b) X + W̃,

where Z̃ collects the extrinsic means (15), W̃ has i.i.d. zero-mean Gaussian entries with variance ṽ given by (16), and γ_w = 1/ṽ denotes the pseudo noise precision. The factor graph corresponding to (18) is shown in Fig. 2, where the dashed square is used to indicate pseudo observations. As a result, the BAd-VAMP algorithm [33] for the standard bilinear recovery problem can be applied. For completeness and ease of reference, we present the derivation of BAd-VAMP [33] based on the factor graph shown in Fig. 2 (b), in which replicas x'_l of the columns of X are introduced via the factors δ(x_l − x'_l). In the following, let x_l and z̃_l denote the l-th column of X and Z̃, respectively. Assume that the message transmitted from the factor node δ(x_l − x'_l) to the variable node x_l is

(19) m(x_l) = N(x_l; r_l, I/γ_l),

which can be viewed as a Gaussian prior on x_l. Combining the pseudo observation equation (18) with the prior (19), the linear MMSE (LMMSE) estimate of x_l is performed and the posterior distribution of x_l is obtained as

(20) q(x_l) = N(x_l; x̂_l, C),

where the posterior mean and covariance matrix are

(21) x̂_l = C (γ_w A(b)ᵀ z̃_l + γ_l r_l),
(22) C = (γ_w A(b)ᵀ A(b) + γ_l I)⁻¹.

In addition, the EM algorithm is incorporated to learn b and update the pseudo noise precision γ_w, i.e.,

(23a)
(23b)

Specifically, for the affine-linear model A(b) = A_0 + ∑_{i=1}^Q b_i A_i, the detailed expressions for estimating b and γ_w are given in [33] as

(24a)
(24b)

where

(25a)
(25b)

and C is given by (22).
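The LMMSE step (20)-(22) is a per-column Gaussian update on the pseudo model (18); since the pseudo noise is i.i.d., all columns share one covariance matrix. A minimal sketch, assuming a fixed estimate of b (so that A = A(b̂) is a known matrix) and omitting the EM updates (23)-(25):

```python
import numpy as np

def lmmse_step(A, z_tilde, gamma_w, r, gamma):
    """LMMSE estimate of X from the pseudo observations Z_tilde = A X + W_tilde,
    cf. (20)-(22): per-column Gaussian prior N(r_l, I/gamma) and i.i.d. pseudo
    noise with precision gamma_w; z_tilde and r stack the columns z_tilde_l, r_l."""
    N = A.shape[1]
    C = np.linalg.inv(gamma_w * A.T @ A + gamma * np.eye(N))  # covariance (22)
    X_hat = C @ (gamma_w * A.T @ z_tilde + gamma * r)         # means (21)
    return X_hat, C
```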

The message from the variable node x_l to the delta node δ(x_l − x'_l) is calculated as

(26) m_{x_l→δ}(x_l) ∝ Proj[q(x_l)] / m(x_l),

where q(x_l) is defined in (20). Projecting the posterior distribution q(x_l) onto a Gaussian distribution with scalar covariance matrix yields

(27) Proj[q(x_l)] = N(x_l; x̂_l, I/η),

where

(28) 1/η = tr(C)/N.

Substituting (27) in (26), we obtain

(29) m_{x_l→δ}(x_l) = N(x_l; r̃_l, I/γ̃),

where

(30) γ̃ = η − γ_l,
(31) r̃_l = (η x̂_l − γ_l r_l)/γ̃.
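Likewise, the scalar-covariance projection (27)-(28) and the extrinsic message (29)-(31) admit a compact columnwise form, since the LMMSE covariance C is shared by all columns. A sketch under the same assumptions as above:

```python
import numpy as np

def project_and_extrinsic(X_hat, C, r, gamma):
    """Project the LMMSE posterior N(x_hat_l, C) onto a Gaussian with scalar
    covariance, cf. (27)-(28), then form the extrinsic message (29)-(31).
    In practice gamma_t is clipped to stay positive."""
    N = C.shape[0]
    eta = N / np.trace(C)                        # (28): 1/eta = tr(C)/N
    gamma_t = eta - gamma                        # (30)
    r_t = (eta * X_hat - gamma * r) / gamma_t    # (31), all columns at once
    return r_t, gamma_t
```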

According to the definition of the factor node δ(x_l − x'_l), the message into the variable node x'_l satisfies

(32a) m_{δ→x'_l}(x'_l) = N(x'_l; r̃_l, I/γ̃).

Combining the prior p(x'_l; θ_x) with the message (32a), the posterior means and variances of x'_l are calculated as

(33)
(34)

where the posterior distribution is

(35) q(x'_l) ∝ p(x'_l; θ_x) N(x'_l; r̃_l, I/γ̃).
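As one possible instance of the input denoising step (33)-(35), the sketch below gives the componentwise MMSE denoiser for a Bernoulli-Gaussian prior p(x) = (1 − ρ)δ(x) + ρN(x; 0, v_x); this prior choice and the function name are illustrative assumptions, and the EM updates (36)-(38) of the prior parameters are omitted.

```python
import numpy as np
from scipy.special import expit  # numerically stable logistic function

def bg_denoise(r, gamma, rho, v_x):
    """Componentwise MMSE denoiser for a Bernoulli-Gaussian prior
    p(x) = (1 - rho) * delta(x) + rho * N(x; 0, v_x), given a pseudo
    observation r = x + noise, noise ~ N(0, 1/gamma), cf. (32a), (35)."""
    v_r = 1.0 / gamma
    # Log-odds that a component is "active", from the two marginal likelihoods
    # N(r; 0, v_x + v_r) (active) and N(r; 0, v_r) (inactive).
    log_odds = (np.log(rho / (1.0 - rho))
                + 0.5 * np.log(v_r / (v_r + v_x))
                + 0.5 * r**2 * v_x / (v_r * (v_r + v_x)))
    pi = expit(log_odds)                       # posterior activity probability
    m_act = r * v_x / (v_x + v_r)              # posterior mean, active component
    v_act = v_x * v_r / (v_x + v_r)            # posterior variance, active component
    x_hat = pi * m_act                                  # posterior mean, cf. (33)-(34)
    v_post = pi * (v_act + m_act**2) - x_hat**2         # posterior variance
    return x_hat, v_post
```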

To learn the remaining unknown parameters (e.g., θ_x), the EM algorithm is applied in the inner iterations [33], i.e.,

(36)

and

(37)
(38)

Now the message from the variable node x'_l to the factor node δ(x_l − x'_l) is calculated as

(39a) m_{x'_l→δ}(x'_l) ∝ Proj[q(x'_l)] / m_{δ→x'_l}(x'_l) = N(x'_l; r'_l, I/γ'),

where q(x'_l) is given by (35), and the extrinsic means r'_l and precision γ' are given by

(40)
(41)

According to the definition of the factor node δ(x_l − x'_l), the message from the factor node to the variable node x_l equals (39a), i.e., r_l = r'_l and γ_l = γ' in the next iteration, which closes the BAd-VAMP algorithm.

III-C Messages from the BAd-VAMP module to the MMSE module

After performing BAd-VAMP for one or more iterations, we now focus on how to calculate the extrinsic message from the BAd-VAMP module to the componentwise MMSE module. Referring to the original factor graph shown in Fig. 1 (a), according to EP, the extrinsic message can be calculated as

(42a) m_{δ→Z}(Z) ∝ Proj[q(Z)] / m_{Z→δ}(Z).

In BAd-VAMP, as shown in subsection III-B above, we have already obtained the message from the factor node δ(x_l − x'_l) to the variable node x_l. It can be seen from Fig. 2 that this message is the same as the prior message (19), so that the LMMSE posterior (20) can be reused. After some algebra, the posterior distribution of z_l = A(b̂)x_l can be calculated to be Gaussian, i.e., q(z_l) = N(z_l; ẑ_l, C_z), with the covariance matrix and mean vector being

(43) C_z = A(b̂) C A(b̂)ᵀ,
(44) ẑ_l = A(b̂) x̂_l.

Then, the posterior distribution of z_l is further projected onto a Gaussian distribution with scalar covariance matrix, yielding

(45) Proj[q(z_l)] = N(z_l; ẑ_l, v_l I),

where

(46) v_l = tr(C_z)/M.

Moreover, the posterior variances are averaged over the index l, which leads to

(47) v̄ = (1/L) ∑_{l=1}^L v_l,

by which q(Z) is approximated as ∏_{i,j} N(Z_ij; ẑ_ij, v̄), where ẑ_ij denotes the (i, j)-th element of [ẑ_1, …, ẑ_L]. As a result, the message in (42) becomes

(48) m_{δ→Z}(Z) = ∏_{i,j} N(Z_ij; z_{A,ij}, v_A),

where

(49) v_A = (1/v̄ − 1/ṽ)⁻¹,
(50) z_{A,ij} = v_A (ẑ_ij/v̄ − z̃_ij/ṽ),

which closes the loop of the whole algorithm.
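Putting (43)-(50) together, the following sketch forms the message from the BAd-VAMP module back to the componentwise MMSE module, again assuming a fixed A(b̂) and the shared per-column covariance C from (22):

```python
import numpy as np

def bilinear_to_mmse_message(A_hat, X_hat, C, z_tilde, v_tilde):
    """Extrinsic message back to the MMSE module, cf. (43)-(50): posterior
    moments of Z = A(b_hat) X, scalar-covariance projection, EP extrinsic.
    z_tilde, v_tilde are the extrinsic means and variance from (15)-(16);
    v_A should be clipped positive in practice."""
    Z_hat = A_hat @ X_hat                             # posterior means (44)
    M = A_hat.shape[0]
    # (43), (46)-(47): C is shared across columns, so the average over l is trivial.
    v_bar = np.trace(A_hat @ C @ A_hat.T) / M
    v_A = 1.0 / (1.0 / v_bar - 1.0 / v_tilde)         # (49)
    z_A = v_A * (Z_hat / v_bar - z_tilde / v_tilde)   # (50)
    return z_A, v_A
```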

To sum up, the BAd-GVAMP algorithm is summarized in Algorithm 1.

1:  Initialization: initialize z_{A,ij}, v_A, r_l, γ_l and the parameter estimates b̂, θ̂_x, θ̂_out.
2:  for t = 1, 2, … do
3:     Compute the posterior means and variance of Z as (10), (12).
4:     Compute the extrinsic means and variance of Z as (15), (16), and set Z̃ and γ_w in (18).
5:     for each inner iteration do
6:        Perform the LMMSE estimate of x_l, i.e., the posterior means and covariance matrix shown in (21) and (22).
7:        Update b̂ and γ_w via (23a) and (23b).
8:     end for
9:     Calculate x̂_l (21) and η (28).
10:     Calculate r̃_l (31) and γ̃ (30).
11:     for each inner iteration do
12:        Perform the input denoising operation to obtain the posterior means (34) and variances (33).
13:        Update θ̂_x via (36).
14:        Calculate the extrinsic quantities (41) and (40).
15:     end for
16:     Calculate the posterior means (44) and variance (47).
17:     Calculate the extrinsic means (50) and variance (49).
18:     Update θ̂_out as (17).
19:  end for
20:  Return b̂ and X̂.
Algorithm 1 Bilinear adaptive generalized VAMP (BAd-GVAMP)
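For illustration, the following self-contained sketch connects the modules of Algorithm 1 on a one-bit compressed sensing instance. It makes several simplifying assumptions not made by the paper: b is known (so A is fixed), the hyperparameters ρ, v_x, σ_w are known (no EM updates), a Bernoulli-Gaussian prior is used, and single inner iterations are performed; damping may be needed in harder settings.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import expit

rng = np.random.default_rng(1)
M, N, L = 128, 64, 4
rho, v_x, sigma_w = 0.1, 1.0, 0.05

A = rng.standard_normal((M, N)) / np.sqrt(M)       # b assumed known: A fixed
X = rng.standard_normal((N, L)) * (rng.random((N, L)) < rho)
Y = np.sign(A @ X + sigma_w * rng.standard_normal((M, L)))

def pos(g, gmin=1e-8):
    # EP extrinsic precisions can turn negative; clipping is a common safeguard.
    return max(g, gmin)

z_A, g_A = np.zeros((M, L)), 1.0    # message (7): N(z_A, 1/g_A), componentwise
r, g_r = np.zeros((N, L)), 1e-3     # message (19): N(r_l, I/g_r), per column

for t in range(50):
    # Componentwise MMSE module (one-bit output), cf. (10)-(16).
    v_A = 1.0 / g_A
    s = np.sqrt(sigma_w**2 + v_A)
    u = Y * z_A / s
    ratio = norm.pdf(u) / np.maximum(norm.cdf(u), 1e-300)
    z_post = z_A + Y * (v_A / s) * ratio
    g_post = 1.0 / np.mean(v_A - (v_A**2 / s**2) * ratio * (u + ratio))  # (12)
    g_t = pos(g_post - g_A)                                # (16)
    z_t = (g_post * z_post - g_A * z_A) / g_t              # (15)

    # LMMSE on the pseudo model z_t = A X + W, cf. (20)-(22) and (27)-(31).
    C = np.linalg.inv(g_t * A.T @ A + g_r * np.eye(N))
    X_hat = C @ (g_t * A.T @ z_t + g_r * r)
    eta = N / np.trace(C)                                  # (28)
    g_in = pos(eta - g_r)                                  # (30)
    r_in = (eta * X_hat - g_r * r) / g_in                  # (31)

    # Bernoulli-Gaussian input denoising, cf. (33)-(35), then its extrinsic.
    v_in = 1.0 / g_in
    lo = (np.log(rho / (1 - rho)) + 0.5 * np.log(v_in / (v_in + v_x))
          + 0.5 * r_in**2 * v_x / (v_in * (v_in + v_x)))
    pi = expit(lo)
    m_act, v_act = r_in * v_x / (v_x + v_in), v_x * v_in / (v_x + v_in)
    x_hat = pi * m_act
    g_den = 1.0 / np.mean(pi * (v_act + m_act**2) - x_hat**2)
    g_r = pos(g_den - g_in)                                # cf. (40)-(41)
    r = (g_den * x_hat - g_in * r_in) / g_r

    # Message back to the MMSE module, cf. (43)-(50).
    Z_hat = A @ X_hat
    g_bar = M / np.trace(A @ C @ A.T)                      # (46)-(47)
    g_A = pos(g_bar - g_t)                                 # (49)
    z_A = (g_bar * Z_hat - g_t * z_t) / g_A                # (50)

scale = np.sum(x_hat * X) / np.sum(x_hat**2)
print("debiased NMSE:", np.sum((scale * x_hat - X)**2) / np.sum(X**2))
```

Since one-bit measurements only weakly constrain the overall scale, the final error is reported after a scalar debiasing step.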

III-D Relation of BAd-GVAMP to BAd-VAMP

The obtained BAd-GVAMP algorithm is an extension of BAd-VAMP from linear measurements to nonlinear measurements. Intuitively, as shown in Fig. 1 (b), BAd-GVAMP iteratively reduces the original generalized bilinear recovery problem to a sequence of standard bilinear recovery problems. In each iteration of BAd-GVAMP, a pseudo linear measurement model is obtained and one iteration of BAd-VAMP is performed. (It is also possible to perform multiple iterations of BAd-VAMP within a single iteration of BAd-GVAMP.) Note that the message passing schedule of the BAd-VAMP module within BAd-GVAMP differs from that of the original BAd-VAMP in [33]: in [33], variable denoising is performed first and then LMMSE, whereas in the BAd-VAMP module of the proposed BAd-GVAMP, LMMSE is performed first and then variable denoising. It is worth noting that in the special case of linear measurements, i.e., when p(Y_ij|Z_ij) is Gaussian, p(Y_ij|Z_ij) = N(Y_ij; Z_ij, σ²), BAd-GVAMP reduces precisely to BAd-VAMP, since in this case the extrinsic means and variances from the MMSE module always satisfy z̃_ij = Y_ij and ṽ = σ², so that the pseudo linear observation model (18) coincides with the actual linear observation model.
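This reduction can be verified numerically (a minimal sketch of our own, not code from the paper): for a Gaussian output channel the componentwise posterior update is conjugate, and the extrinsic computation (15)-(16) cancels the incoming message exactly, returning the original observations and noise variance.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 0.3                                   # output noise variance
y = rng.standard_normal(5)                     # observations Y_ij
z_A, v_A = rng.standard_normal(5), 0.7         # incoming message (7)

# Conjugate posterior for the Gaussian channel p(y|z) = N(y; z, sigma2).
v_post = 1.0 / (1.0 / v_A + 1.0 / sigma2)
z_post = v_post * (z_A / v_A + y / sigma2)

# Extrinsic step, cf. (15)-(16): the incoming message cancels exactly.
v_ext = 1.0 / (1.0 / v_post - 1.0 / v_A)
z_ext = v_ext * (z_post / v_post - z_A / v_A)
print(np.isclose(v_ext, sigma2), np.allclose(z_ext, y))  # True  True
```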