I Introduction
Line spectral estimation (LSE) [1], which aims to extract the frequencies of a mixture of sinusoids, is a fundamental problem in signal processing, as it arises in numerous disciplines in engineering, physics, and the natural sciences, such as radar signal processing and channel estimation [2, 3]. Classical methods include maximum likelihood (ML) [4] and subspace methods such as MUSIC and ESPRIT [5, 6, 7]. The ML approach involves maximizing a nonconvex, multimodal function with a sharp global maximum, which requires accurate initialization. Compared to subspace-based approaches, ML performs better when the number of measurements or the SNR is small.
Compressed sensing (CS) based approaches, which exploit the sparsity of the signal in the frequency domain, have been proposed. By discretizing the frequency onto a grid to construct a dictionary matrix, the original nonlinear frequency estimation problem is transformed into a sparse linear inverse problem [8]. Since the frequency is continuous-valued, discretization incurs grid mismatch [9]. As a result, off-grid methods have been proposed, which refine the grid gradually to overcome the grid mismatch effectively [10, 11, 12, 13, 14].
To work directly with continuously parameterized dictionaries, gridless methods have been proposed; they can be classified into two categories: atomic norm based approaches [15, 16, 17, 18, 19, 20, 21, 22] and variational LSE (VALSE) approaches [23, 24]. The atomic norm based approaches allow working with an infinite, continuous dictionary and provably recover well-separated frequencies perfectly in the noiseless case. VALSE, in contrast, treats the frequencies as random variables and automatically estimates the model order, the noise variance, and the posterior probability density functions (PDFs) of the frequencies. As for computational complexity, the atomic norm based approaches involve solving a semidefinite program (SDP), which is usually very expensive. In contrast, VALSE can be implemented with much lower computational complexity.
From the Bayesian algorithm point of view, approximate message passing (AMP) [25] and generalized approximate message passing (GAMP) [26] have been proposed to deal with sparse signal recovery from linear and nonlinear measurements. It has been shown that both AMP and GAMP can be derived from expectation propagation (EP) [27, 28, 29, 30, 31, 32]. By treating the frequencies and amplitudes as random variables and formulating the factor graph, we derive EP LSE (EPLSE). We adopt a Bernoulli-Gaussian prior distribution to estimate the model order. In addition, expectation maximization (EM) is incorporated to iteratively learn the parameters of both the prior and the output distribution [33].

Notation: For a univariate function $f(x)$, let $f'(x)$ and $f''(x)$ denote the first and second derivatives of $f(x)$, respectively.
Let $\mathcal{CN}(x;\mu,\Sigma)$ denote the complex normal distribution of $x$ with mean $\mu$ and covariance $\Sigma$, and let $\mathcal{VM}(\theta;\mu,\kappa)$ denote the von Mises distribution of $\theta$ with mean direction $\mu$ and concentration parameter $\kappa$. Let $\Re\{\cdot\}$ return the real part. Let $\mathbf{I}_N$ denote the identity matrix of dimension $N$. For a distribution $q(x)$, let $\mathrm{Proj}[q(x)]$ denote its Gaussian approximation with matched mean and variance. For a set $\mathcal{S}$, let $|\mathcal{S}|$ denote its cardinality. For a complex vector $\mathbf{x}$, $|\mathbf{x}|$ returns its elementwise magnitude.

II Problem Setup
Let $\mathbf{z}\in\mathbb{C}^{N}$ be a line spectrum signal consisting of $K$ complex sinusoids

$\mathbf{z}=\sum_{k=1}^{K} x_k \mathbf{a}(\theta_k)$, (1)

where $x_k$ is the complex amplitude of the $k$th frequency, $\theta_k$ is the $k$th frequency, and

$\mathbf{a}(\theta)=[1, e^{j\theta}, \ldots, e^{j(N-1)\theta}]^{\mathrm{T}}$. (2)
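As a concrete illustration, the signal model (1)-(2) can be sketched in NumPy; the function names below are illustrative, not from the paper.

```python
import numpy as np

def steering_vector(theta, N):
    """a(theta) = [1, e^{j*theta}, ..., e^{j(N-1)*theta}]^T, per (2)."""
    return np.exp(1j * theta * np.arange(N))

def line_spectrum_signal(thetas, amps, N):
    """z = sum_k x_k * a(theta_k), per (1)."""
    return sum(x * steering_vector(th, N) for x, th in zip(amps, thetas))

# example: K = 2 sinusoids of length N = 8
z = line_spectrum_signal([0.5, 1.7], [1.0 + 0.5j, -0.8j], 8)
```

Since the first entry of every steering vector is 1, the first sample of `z` is simply the sum of the complex amplitudes.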
The LSE $\mathbf{z}$ undergoes a componentwise (linear or nonlinear) transform, which is described as a conditional PDF

$p(\mathbf{y}|\mathbf{z};\boldsymbol{\beta})=\prod_{n=1}^{N} p(y_n|z_n;\boldsymbol{\beta})$, (3)

where $\boldsymbol{\beta}$ are the unknown nuisance parameters. In the following, we give some examples of the measurement model.

The LSE is corrupted by additive white Gaussian noise (AWGN) and is described by

$\mathbf{y}=\mathbf{z}+\mathbf{w}$, (4)

where $\mathbf{w}\sim\mathcal{CN}(\mathbf{0},\sigma^2\mathbf{I}_N)$ and $\sigma^2$ is the variance of the noise.

Incomplete measurements, where only a subset of the measurements is observed.

Quantized measurements [36]: $\mathbf{y}=\mathcal{Q}(\mathbf{z}+\mathbf{w})$, where $\mathcal{Q}(\cdot)$ denotes a quantizer which maps continuous values into discrete numbers.

Magnitude measurements [37]: $\mathbf{y}=|\mathbf{z}+\mathbf{w}|$, or more generally $\mathbf{y}=|\mathbf{z}+\mathbf{w}_1|+\mathbf{w}_2$, where $\mathbf{w}_2$ is also AWGN.

Impulsive noise [40]: $\mathbf{y}=\mathbf{z}+\mathbf{w}$, where $\mathbf{w}$ is non-Gaussian; for example, $\mathbf{w}$ denotes the impulsive noise and follows a Laplace distribution.
For all the listed cases, $\sigma^2$ refers to the variance of the noise.
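The measurement channels listed above can be simulated as follows; this is a sketch, and the quantizer parameters (bit depth, clipping level) are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def awgn(z, sigma2):
    """y = z + w with circular complex Gaussian noise of variance sigma2, per (4)."""
    w = np.sqrt(sigma2 / 2) * (rng.standard_normal(z.shape)
                               + 1j * rng.standard_normal(z.shape))
    return z + w

def quantize(z, sigma2, bits=3, clip=2.0):
    """Quantized measurements y = Q(z + w), applied to real and imaginary parts."""
    y = awgn(z, sigma2)
    step = 2 * clip / 2 ** bits
    q = lambda u: np.clip(np.floor(u / step) * step + step / 2, -clip, clip)
    return q(y.real) + 1j * q(y.imag)

def magnitude(z, sigma2):
    """Magnitude-only measurements y = |z + w|."""
    return np.abs(awgn(z, sigma2))
```

Incomplete measurements can be obtained by indexing `awgn(z, sigma2)` with the observed subset.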
Since the sparsity level $K$ is usually unknown, a line spectrum signal consisting of $N$ complex sinusoids is assumed [23]

$\mathbf{z}=\mathbf{A}(\boldsymbol{\theta})\mathbf{x}$, (5)

where $\mathbf{A}(\boldsymbol{\theta})=[\mathbf{a}(\theta_1),\ldots,\mathbf{a}(\theta_N)]$ and $\mathbf{x}=[x_1,\ldots,x_N]^{\mathrm{T}}$. For the frequencies and coefficients, an i.i.d. Bernoulli-Gaussian prior distribution is used, i.e.,

$p(x_k;\rho,\tau)=(1-\rho)\delta(x_k)+\rho\,\mathcal{CN}(x_k;0,\tau)$, (6)

where

$\rho\in(0,1),\quad \tau>0$ (7)

are unknown parameters. Note that the Bernoulli-Gaussian distribution is imposed to enforce sparsity. For the prior distribution $p(\theta_k)$, a von Mises distribution can be encoded [34]. For an uninformative prior, we assume $p(\theta_k)=1/(2\pi)$, $\theta_k\in[0,2\pi)$. Let
$\boldsymbol{\Theta}=\{\boldsymbol{\theta},\mathbf{x},\mathbf{z}\}$ (8)

$\boldsymbol{\beta}=\{\rho,\tau,\sigma^2\}$ (9)

be the set of all random variables and the model parameters, respectively. According to the Bayes rule, the joint PDF is

$p(\mathbf{y},\boldsymbol{\Theta};\boldsymbol{\beta})=p(\mathbf{y}|\mathbf{z};\boldsymbol{\beta})\,p(\mathbf{z}|\boldsymbol{\theta},\mathbf{x})\prod_{k=1}^{N}p(\theta_k)\,p(x_k;\rho,\tau)$. (10)

Given the above joint PDF (10), the type II maximum likelihood (ML) estimate of the model parameters is

$\hat{\boldsymbol{\beta}}=\underset{\boldsymbol{\beta}}{\operatorname{argmax}}\;p(\mathbf{y};\boldsymbol{\beta})$. (11)
Then the minimum mean square error (MMSE) estimate of the parameters is

$\hat{\boldsymbol{\Theta}}=\mathrm{E}[\boldsymbol{\Theta}|\mathbf{y};\hat{\boldsymbol{\beta}}]$, (12)

where the expectation is taken with respect to the posterior PDF

$p(\boldsymbol{\Theta}|\mathbf{y};\hat{\boldsymbol{\beta}})=p(\mathbf{y},\boldsymbol{\Theta};\hat{\boldsymbol{\beta}})/p(\mathbf{y};\hat{\boldsymbol{\beta}})$. (13)
Directly solving the ML estimate (11) or the MMSE estimate (12) is intractable. As a result, an iterative algorithm is designed in Section III.
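As a concrete illustration of the probabilistic model above, the following sketch draws a sparse coefficient vector from the Bernoulli-Gaussian prior (6) and frequencies from the uninformative prior; function names and the random seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_bernoulli_gaussian(N, rho, tau):
    """Draw x_k ~ (1 - rho) * delta(x_k) + rho * CN(0, tau), per (6)."""
    active = rng.random(N) < rho                       # Bernoulli support
    x = np.sqrt(tau / 2) * (rng.standard_normal(N)
                            + 1j * rng.standard_normal(N))
    return np.where(active, x, 0.0)

def sample_uniform_frequencies(N):
    """Uninformative prior p(theta_k) = 1/(2*pi) on [0, 2*pi)."""
    return rng.uniform(0.0, 2 * np.pi, N)
```

On average a fraction `rho` of the `N` assumed components is active, which is how the model order is encoded.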
III EPLSE Algorithm
The factor graph is presented in Fig. 1, and a delta factor node is introduced.¹ Before the derivation, the notation in Table I is used.

¹ Such a factor graph was originally introduced in [30] for the generalized linear model (GLM), where a unified Bayesian inference framework for GLM was proposed.

TABLE I: Message notation.
The message from the factor node to the variable node
The message from the variable node to the factor node  
The message from the factor node to the variable node  
The message from the factor node to the variable node  
The message from the variable node to the factor node  
The message from the variable node to the factor node 
First, we initialize the messages and the model parameters. Then we show how to update the messages to perform line spectral estimation.
III-A Update
According to EP, can be updated as
(14) 
Then we calculate the componentwise posterior means and variances of $\mathbf{z}$ with respect to the approximate posterior as
(15)  
(16) 
Therefore the projected distribution is²

² Note that for the linear measurement model (4), the projection operation is unnecessary.
(17) 
According to (14), is calculated to be
(18) 
where and are
(19a)  
(19b) 
We also incorporate the expectation maximization (EM) algorithm to learn $\boldsymbol{\beta}$. The posterior distribution of $\mathbf{z}$ is approximated as (17). We compute the expected complete log-likelihood function with respect to it, and drop the irrelevant terms to obtain
(20) 
Then is updated as
(21) 
For the AWGN model
(22) 
where , the posterior PDF is . Therefore we obtain
(23) 
For an arbitrary measurement model, we also obtain an approximate update equation. From the definitions above, we obtain a pseudo measurement model
(24) 
where , and . The noise variance is updated as [41]
(25) 
Note that (25) includes (23) as a special case for the AWGN model.
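To make the EM noise-variance step concrete, the following is a plausible sketch of the residual-plus-posterior-variance form such updates take for the AWGN model, assuming the posterior mean `z_hat` and componentwise variances `z_var` of $\mathbf{z}$ are available; the exact expressions are (23) and (25) in the text, and the names here are illustrative.

```python
import numpy as np

def em_noise_variance(y, z_hat, z_var):
    """EM-style update of sigma^2 for AWGN: average residual power of the
    posterior mean plus the average posterior variance of z."""
    return (np.sum(np.abs(y - z_hat) ** 2) + np.sum(z_var)) / y.size
```

When the posterior mean matches the data exactly, the update is driven entirely by the remaining posterior uncertainty in $\mathbf{z}$.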
III-B Updating of
The subfactor graph is presented in Fig. 3. According to EP, is updated as
(26) 
Now we calculate the numerator term. First, and implies the following pseudo measurement model
(27) 
where . Conditioned on , we approximate as a Gaussian distribution by averaging over and . Straightforward calculation yields
(28)  
(29) 
which does not depend on and we use in the following text. Now we perform integration (IIIB) with respect to . Conditioned on , by viewing
(30) 
and
(31) 
as the prior and likelihood of , respectively, we perform integration over to yield
(32) 
Then is
(33)  
(34) 
where
(35) 
Then we project as a von Mises distribution
(36) 
Note that the PDF proportional to is a product of a wrapped normal distribution and a von Mises distribution. Numerically, it was found that the maximum of is near . Thus we perform a single Newton refinement at and obtain the approximated von Mises distribution, as shown in Section VI-A.
As a result, using (IIIB) we have
(37) 
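The projection onto a von Mises distribution can be illustrated numerically by moment matching on a grid: the mean direction comes from the first circular moment, and the concentration from a standard approximation to the inverse of $A(\kappa)=I_1(\kappa)/I_0(\kappa)$. This sketch is illustrative only and is not the single-step Newton refinement described in Section VI-A.

```python
import numpy as np

def project_to_von_mises(theta_grid, weights):
    """Moment-match a density on [0, 2*pi), given up to normalization on a
    grid, to VM(theta; mu, kappa)."""
    w = weights / np.sum(weights)
    m1 = np.sum(w * np.exp(1j * theta_grid))   # first circular moment
    mu = np.angle(m1)
    R = np.abs(m1)
    # standard approximation to A^{-1}(R) (Fisher, Statistical Analysis of
    # Circular Data)
    kappa = R * (2 - R ** 2) / (1 - R ** 2)
    return mu, kappa
```

For a sharply peaked density the recovered mean direction sits at the peak and the concentration grows with the peak's sharpness.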
III-C Updating of
The subfactor graph is presented in Fig. 4.
(38) 
where
(39) 
Given that we do not have any knowledge of the prior distribution, the message is calculated to be
(40) 
III-D Updating of
The subfactor graph is presented in Fig. 5. can be updated as
(41)  
(42) 
Similar to (IIIB), can be equivalently formulated as
(43) 
Conditioned on , we approximate as a Gaussian distribution by averaging over and . Straightforward calculation yields
(44)  
(45)  
(46)  
(47) 
Thus
(48) 
Then we project as a Gaussian distribution. For simplicity, we first project as a Gaussian distribution. Then we have
(49) 
According to (IIID), we approximate and as
(50a)  
(50b) 
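The Gaussian projection used in this subsection can likewise be illustrated by moment matching: given a univariate density on a grid (up to normalization), the projected Gaussian simply takes its mean and variance. A minimal sketch with illustrative names:

```python
import numpy as np

def project_to_gaussian(x_grid, weights):
    """Moment-match a density, given up to normalization on a grid, to a
    Gaussian: return the matched mean and variance."""
    w = weights / np.sum(weights)
    mean = np.sum(w * x_grid)
    var = np.sum(w * (x_grid - mean) ** 2)
    return mean, var
```

Projecting a Gaussian onto itself recovers its own mean and variance, which is a quick sanity check for the moment-matching step.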