Due to the propagation characteristics, underwater acoustic communications typically suffer from strong multipath and Doppler effect . Since the propagation speed is low (around 1500 m/s), the reflection of acoustic waves from surface and bottom of the ocean, along with the refraction caused by nonuniformity of sound speed in water, cause significant multipath delay spread. The typical delay spread of underwater acoustic communication channels is from several milliseconds to tens of milliseconds, and could be hundreds of milliseconds and more in extreme cases. On the other hand, due to the movement of transmitter and receiver platforms, as well as water media, Doppler effect is almost inevitable in underwater acoustic communications. Considering the wide band nature of underwater acoustic communication systems, the Doppler effect usually exhibits itself as signal dilation or compression .
With the advantages of low complexity frequency domain multipath channel equalization and high bandwidth efficiency, orthogonal frequency division multiplexing (OFDM) technology was introduced into underwater acoustic communications, and soon became a popular choice especially for high speed communications[3, 4, 5]. However, Doppler effect in underwater acoustic channels could destroy the orthogonality between the subcarriers in OFDM and hence deteriorates its performance. In , a two-step approach was proposed to compensate the Doppler distortion, in which a resampling operation is carried out first to convert the wide band Doppler distortion into narrow band carrier frequency offset (CFO). In the second step, the CFO is estimated and compensated. Experimental results in  demonstrated that the proposed scheme can effectively mitigate the Doppler distortion.
In coherent underwater acoustic communication systems, accurate channel estimation is essential. Since compressed sensing based channel estimation methods exploit the structure of the channel and require less number of measurements, they have been the most popular choices for underwater acoustic OFDM systems [6, 7]. Among compressed sensing algorithms, orthogonal matching pursuit (OMP) algorithm  features lower computational complexity than basis pursuit methods, and it is widely adopted in channel estimations [9, 10]. On the other hand, as an approximation for sum-product algorithm (SPA), approximate message passing (AMP) has been utilized for sparse linear estimation problems , and hence is also employed for sparse channel estimations .
To further improve the performance of wireless communication systems in challenging channel conditions, joint receivers in which modules such as channel estimation, channel equalization interactively work together are proposed, both for radio and underwater acoustic communication systems. In 
, a joint channel estimation, equalization and data detection system based on expectation-maximization (EM) algorithm is proposed for OFDM systems with high mobility. In the proposed scheme, the time-varying channel is represented by discrete cosine orthogonal basis functions and the equalized symbols are quantized into nearest data symbol constellation point in each iteration. A low complexity generalized AMP (GAMP) based joint channel estimation and data decoding approach for OFDM system is proposed. Later, a joint channel estimation and symbol detection scheme based on parametric bilinear GAMP algorithm is proposed for single-carrier systems 
, which supports the sparse characteristic of wireless communication channels and soft-input soft-output (SISO) equalizer. Recently, based on Markov chain Monte Carlo (MC) algorithm with Gibbs sampling and convolutional coding plus differential coding, a joint phase shift keying (PSK) data detection and channel estimation scheme utilizing channel sparsity is proposed in.
For underwater acoustic communications, the concept of joint processing has been widely pursued. In the early work , an iterative receiver based on message passing techniques is proposed for single carrier underwater acoustic communication systems, in which equalization and decoding are performed multiple times in a turbo manner. However, channel estimation is carried out based on training symbols, utilizing maximum-likelihood approach. In [18, 19, 20], iterative receivers were developed for single carrier multiple-input multiple-output (MIMO) underwater acoustic communication systems, for both time-domain and frequency domain equalization. In the proposed iterative receivers, soft equalization outputs are feedback for interference cancelation and channel estimation. For OFDM underwater acoustic communication systems in doubly spread channels, an iterative sparse channel estimation and data detection receiver using partial interval demodulation is proposed in . In , an iterative receiver is proposed for inter-frequency interference (IFI) cancelation for single-carrier underwater acoustic communication systems adopting frequency domain equalization. Channel impulse responses, explicit and residual IFI are updated in an iterative fashion based on the soft symbol estimates in the proposed receiver. Considering the impulse noise in underwater acoustic communication channels, a joint channel estimation and impulsive noise mitigation receiver utilizing sparse Bayesian learning (SBL) algorithm is proposed , in which the information on data symbols are employed to carry out joint time-varying channel estimation and data detection, to further improve the system performance. The performance of the proposed algorithm is verified through both numerical simulations and real data. More recently, a joint channel estimation and equalization scheme with superimposed training is proposed for single carrier underwater acoustic communication systems . A message-passing-based bidirectional channel estimation algorithm is adopted in the proposed scheme, utilizing the correlation between consecutive channels.
It is worth noting that all the above compressed sensing based approaches utilize the fixed dictionary and suffer grid mismatch , and the gridless parametric channel estimation approach indeed improves both the channel estimation and data decoding performance over the grid based approaches [26, 27]. Moreover, existing iterative receivers seldom incorporate the CFO estimation and compensation into the design. In this paper, according to message passing and from the modularized point of view , an iterative receiver based on gridless variational Bayesian line spectra estimation (VALSE)  which jointly estimates the residual CFO and channel, meanwhile also achieves data decoding named JCCD-VALSE is proposed for underwater acoustic OFDM systems. The JCCD-VALSE consists of three modules named as high resolution gridless channel estimation, minimum mean squared error (MMSE) estimation, and low-density parity-check code (LDPC) decoder. Through the exchange of soft information between modules, the channel estimation and data decoding accuracy progressively improves. Compared with existing receivers, the joint processing including residual CFO estimation and compensation, the gridless channel estimation, and the feature of automatical estimation of nuisance parameters are the advantages of the proposed scheme. Hence, the contributions of this paper include:
Targeting at underwater acoustic OFDM systems in channels with an approximately common Doppler scaling factor, the system model for received signal after resampling is derived, and the corresponding receiver JCCD-VALSE which jointly estimates residual CFO, channel, and finishes data decoding is proposed.
In JCCD-VALSE, all the nuisance parameters such as the number of paths, the noise variance are automatically estimated. This reduces the reliance on priori information, and also improves the estimation accuracy.
In JCCD-VALSE, gridless channel estimation based on VALSE is adopted, which improves the channel estimation and data decoding performance, compared with the existing channel estimation and data decoding methods. Besides, it is also shown that taking residual CFO into consideration benefits both the channel estimation and data decoding performances.
Sea trial data of mobile acoustic communication experiment (MACE10) is utilized to further demonstrate the effectiveness of the proposed scheme.
In addition, it is also worth noting that our proposed approach is very flexible and can deal with both uniform and nonuniform pilots.
The rest of this paper is organized as follow: Section II introduces the system model. The probabilistic formulation of the channel, the data symbol and the normalized residual CFO are presented in Section III. The ensuing Section IV proposes the joint CFO, channel estimation and data decoding algorithm. Section V presents the performance metrics in terms of the coded bit error rate (BER) and the normalized mean squared error (NMSE) of the various algorithms. The sea trial data decoding results are introduced in Section VI. Finally, we conclude the paper in Section VII.
Notation: For a matrix , let or denote the th element of , and
returns a vector with elements being the diagonal elements of. Let and denote the real and imaginary part operator, respectively. For random vectors and
with joint probability density function (PDF), the marginal PDF with being . Let denote the Hadamard product operator. For a vector , let return a diagonal matrix with diagonal elements being . Let be a subset of indices. For a square matrix , let denote the submatrix by choosing both the rows and columns of indexed by . Let , and be the conjugate, transpose and Hermitian transpose operator of , respectively. For two vectors and , denotes the componentwise division.
Ii System model
This section establishes the OFDM model similar to , and the assumption that all paths have a similar Doppler scaling factor is adopted, which seems to be justified as long as the dominant Doppler effect is caused by the direct transmitter/receiver motion. For completeness, the detailed derivations are presented.
Consider the cyclic prefix (CP) OFDM system, and let and denote the OFDM symbol duration and the CP duration. The number of subcarriers is and the frequency spacing is . With carrier frequency , the frequency of the th subcarrier is
and the bandwidth is .
Let denote the information symbol to be transmitted on the th subcarrier. The transmitted continuous time OFDM block in passband is
A multipath underwater channel where all paths have a similar Doppler scaling factor is considered, and the channel impulse response is
where and denote the gain and delay of the th path, is the total number of paths. The received continuous time signal in passband is
After downconverting and low pass filtering, the baseband signal can be obtained as
where is the additive noise in baseband. Note that each subcarrier experiences a Doppler-induced frequency shift . Provided that the bandwidth of the OFDM signal is comparable to the center frequency, the Doppler-induced frequency shift differs considerably on different subcarriers, which introduces the intercarrier interference.
In order to mitigate the Doppler effect and convert the wideband OFDM system into a narrowband system, a resampling approach is carried out first . Suppose the Doppler factor is estimated as . Resample with a resampling factor to yield
The resampling factor is chosen to make as close to one as possible. As a result, (II) can be approximated as
From (2), usually CFO estimation and compensation are carried out as the second step to further remove the impact of Doppler . Let denote the output of the CFO compensation, sampling at , yields 111In engineering application, fftshift operator is usually introduced and model (3) can also be formulated as .
where ; the th entry of is ; denotes the normalized DFT matrix with the th entry being ; denotes the normalized residual CFO which is usually very small and close to , ; is the frequency-domain channel response vector and can be viewed as a line spectra, which can be represented as
where denotes the complex coefficient vector related to the path gains and delays of the channel and
denotes the frequency component related to the path time delay of the channel and
denotes the number of paths, is
denotes the noise and approximation error which is assumed to follow the Gaussian distribution, i.e.,with being the unknown variance.
Then, (3) is simplified as
Assume that the symbol vector are partitioned into three parts corresponding to the pilot, null, and data, which are represented as , and and . Then the goal is to jointly estimate the data symbol , the channel vector , the normalized residual CFO , and the nuisance parameters such as by fully exploiting the channel structure . In the following, we design an approximate Bayesian algorithm.
Iii Probabilistic Formulation
This section describes the probabilistic formulation of the channel , the data symbol and the normalized residual CFO .
Iii-a Parametric Channel Model
For the channel , the number of path is usually unknown. Here the number of paths is assumed to be and , i.e.,
where . Since the number of path is supposed to be , the binary hidden variables are introduced, where means that the th path is active, otherwise deactive (). Let
denote the prior probability of each path being active, i.e.,
Then the probability mass function (PMF) ofis . Given that , we assume that , where denotes the priori variance of the path being active. Thus follows a Bernoulli-Gaussian distribution, that is
In addition, the distribution of conditioned on is .
From (7) and (8), it can be seen that the parameter denotes the probability of the th component being active and is a variance parameter. The variable has the prior PDF . Generally, is encoded through the von Mises distribution [30, p. 36]
where and are the mean direction and concentration parameters of the prior of the th frequency , is the modified Bessel function of the first kind and the order [30, p. 348]. For a given von Mises distribution, . In addition, [30, pp. 26]. Without any knowledge of the frequency , the uninformative prior distribution is used. For more details please refer to .
Iii-B OFDM System
A single-input single-output OFDM system with subcarriers is considered. Since the structure of the consecutive OFDM symbols is not exploited, a single OFDM system is modeled 222As you see, the proposed approach outputs the PDFs of the amplitudes and delays of all the paths, which is very suitable for channel tracking and will be left for future work.. Of the subcarriers, and are dedicated as pilots and nulls, respectively, and the remaining are used to transmit the coded/interleaved data bits. The sets , , denote the indices of the null, pilot and data subcarriers, respectively. It follows that and , and . Besides, the null subcarriers are not necessary to delicate and can be set as the empty set to improve the data transmission rate.
Let the vector be the transmitted (equi-probable) information bits. These bits are coded by a rate- encoder and interleaved to get the length- vector , where is the interleaving and coding function such as a turbo code and a LDPC code. The coded/interleaved bits are partioned into subvectors , , which are then mapped to the th subcarrier. Let denote the -ary mapping, where is the symbol alphabet. The pilots are drawn from the pilot symbol alphabet . In OFDM, is typically a -ary quadrature amplitude modulation (QAM) alphabet and is a quadrature phase shift keying (QPSK) alphabet. The nulls, pilots and data symbols are stacked in vector . Vectors , and contain the nulls, pilot symbol and data symbols, respectively.
For the OFDM symbol, the information bits has the following PDF
where denotes the indicator function, denotes the index set of the information bits. For the interleaving and coding function, the PMFs of subvectors conditioned on are
Note that describe the structure of the channel code and interleaver, whose details are not provided. Conditioned on , the PMF of is .
The normalized residual CFO is treated as an unknown deterministic parameter and .
According to the above probabilistic model, the factor graph representation describing the channel and OFDM system model is given in Fig. 1. The joint PDF is
Firstly, the maximum likelihood estimation of the nuisance parameters are
Then the BER optimal receiver, i.e., maximum a posterior (MAP) estimate is
can be obtained via marginalizing all the random variables butin the joint PDF . Note that solving either (III-B) or (12) is computationally intractable. Therefore, we resort to approximate Bayesian methods in the ensuing section.
This section utilizes expectation propagation (EP)  to develop the JCCD-VALSE algorithm through novelly combining the gridless VALSE, the MMSE module and the LDPC decoder module.
It is worth noting that in conventional underwater OFDM systems, null subcarriers are beneficial for both Doppler and noise estimation. While in this work, the pilot and data can be used for both Doppler and noise estimation. As a consequence, the null subcarriers can be replaced with the data subcarriers, which improves the spectrum efficiency. Define the ordered set .
Before designing the algorithm, an important point is emphasized. Note that
This means that
is equivalent to the original observation model (5), where .
It is beneficial to introduce an additional hidden variable defined in (4), and an equivalent factor shown in Fig. 2 (a) can be obtained. Compared to the original factor graph in Fig. 1, a delta factor node  is introduced, which is the key to decompose the original problem into subproblems to be solved by designing the respective modules. According to Fig. 2 (b), the JCCD-VALSE can be summarized as follows: Firstly, initialize the extrinsic message from module A to module B as
and calculate the extrinsic message from module B to module C. Secondly, running the LDPC decoder algorithm in module C to obtain the posterior means and variances of . Thirdly, update the residual CFO and noise variance estimate in module B. Fourthly, calculate the extrinsic message from module B to module A. Finally, calculate the extrinsic message from module A to module B, which closes the algorithm. The details of the above steps are described below.
Iv-a Calculate the Message
According to the belief propagation (BP), the message is
For each , is drawn from a finite alphabet set, and the PMF can be obtained by evaluating the above Gaussian PDF at the points of the symbol alphabet followed by normalization.
Iv-B Calculate the Posterior Means and Variances of in LDPC Module
The message is input to the LDPC module, where the SPA is applied. After several iterations where the change of the loglikelihood ratio of the bits is less than a threshold or the number of iterations exceeds the setting maximum, the beliefs of the data symbols are obtained. These beliefs are further used to calculate the posterior means and variances of the data.
Iv-C Residual CFO and Noise Variance Estimation
The type II approach  and the EM algorithm are used to estimate the normalized residual CFO and noise variance , respectively. The type-II method is also termed as type-II maximum likelihood (ML) method , which aims to to maximize the marginal likelihood in Bayesian models. To estimate the residual CFO via type II approach, the marginal likelihood is obtained. Define
and model (19) can be formulated as
where . Straightforward calculation shows that the posterior means and variances of are
where the posterior means and variances of are obtained in module A, i.e., the VALSE module, see Subsection IV-E (eq. (53a) and eq. (53b)). Suppose that follows Gaussian distribution with diagonal covariance matrix and independent of , i.e., . For the pseudo measurement , it follows . The type II ML estimation problem can be formulated as
One can apply the Newston step to refine the previous estimate as
where . It is worth noting that the residual CFO estimation approach (21) is very general and generalize the null subcarriers based approach in . By letting , , i.e., the posterior variances of corresponding to pilot and data subcarriers tend to infinity, (21) reduces to the null subcarriers based approach . Since we utilize more data to estimate the residual CFO, it makes sense that our approach estimates the residual CFO more accurately, leading to better channel estimation and data decoding performance, as illustrated in the numerical experiments.
Once the residual CFO is updated, the pseudo measurements are refined according to (14).
The EM approach is adopted to estimate the noise variance, which is
where the expectation is taken with respect to the posterior PDF of .
Iv-D Calculate the Extrinsic Message from Module B to Module A
According to variational message passing , the message can be calculated as
where is to ensure that the PDF is normalized and denotes the expectation of with respect to the posterior PDF. It can be seen that is Gaussian distributed.
From (IV-D), one has . Now we calculate the message corresponding to the pilots and data subcarriers, respectively.
For , straightforward calculation shows that
Iv-E Calculate the message
Define as the matrix chosen from the rows of indexed by . The pseudo linear measurement model under heterogenous noise is
where , and .
The VALSE algorithm tries to construct a structured PDF to approximate the true PDF by minimizing their Kullback-Leibler (KL) divergence . For , it is supposed to be factored as
where is restricted to a point estimate. Given that VALSE outputs the approximated PDF , according to EP , the extrinsic message of from module A to module B is
where denotes the projection operation which approximates the target distribution
as Gaussian distribution with moment matching, i.e., the means and variances ofmatches with that of the approximated Gaussian distribution. Let and denote the posterior means and variances of