I Introduction
Orthogonal frequency division multiplexing (OFDM) is widely adopted in several wired and wireless communication standards, such as digital audio broadcasting (DAB) [1], digital video broadcasting-terrestrial (DVB-T) [2], worldwide interoperability for microwave access (WiMAX) technologies [3], and the Long Term Evolution-Advanced (LTE-A) standard [4]. OFDM is also a strong candidate for the fifth generation of wireless networks (5G) [5].
One of the main advantages of OFDM is that each subcarrier experiences flat fading even though the overall signal spectrum suffers from frequency-selective fading. Moreover, incorporating a cyclic prefix (CP) prevents inter-symbol interference (ISI) if the CP length is larger than the maximum delay spread of the channel. Consequently, a low-complexity single-tap equalizer can be utilized to eliminate the impact of the multipath fading channel. Under such circumstances, the OFDM demodulation process can be performed once the fading parameters at each subcarrier, commonly denoted as channel state information (CSI), are known accurately. Therefore, robust channel estimation techniques should be invoked to avoid performance degradation [6]-[18].
In general, channel estimation can be classified into blind [6]-[11] and pilot-aided techniques [12]-[18]. Blind channel estimation techniques are spectrally efficient because they do not require any overhead to estimate the CSI. Nevertheless, such techniques have not yet been adopted in practical OFDM systems. Conversely, pilot-based CSI estimation is preferred for practical systems because it is typically more robust and less complex. In pilot-based CSI estimation, the pilot symbols are embedded within the subcarriers of the transmitted OFDM signal in the time and frequency domains; hence, the pilots form a two-dimensional (2-D) grid [2], [4]. The density of the pilot symbols depends on the frequency selectivity and time variation of the channel, or equivalently, the coherence bandwidth and coherence time of the channel. The channel response at the pilot symbols can be obtained using least-squares (LS) frequency-domain estimation, and the channel parameters at other subcarriers can be obtained using various interpolation techniques [19]. The density of the pilot grid and the interpolation technique used create a compromise among the error performance, spectral efficiency, and computational complexity. The spectral efficiency is determined by the pilots’ density, which has to satisfy the 2-D sampling theorem. The computational complexity is determined by the interpolation technique. Optimal interpolation requires a 2-D Wiener filter that exploits the time and frequency correlation of the channel; however, it is substantially complex to implement [20]. In time-varying channels, the spectral efficiency can be enhanced by changing the pilots’ grid structure adaptively based on the channel conditions [21]. The complexity can be reduced by decomposing the 2-D interpolation process into two cascaded 1-D processes, and then using less computationally involved interpolation schemes [22], [23]. Low-complexity interpolation, however, is usually accompanied by error rate performance degradation [23].
It is also worth noting that most practical OFDM-based systems utilize a fixed grid pattern structure [2], [4]. Moreover, the standards allow changing the power of the pilot symbols based on the CSI. At low SNRs, the pilot symbols’ power can be boosted by an additional or dB [4].
Once the channel parameters are obtained for all subcarriers, the received samples at the output of the fast Fourier transform (FFT) are equalized to compensate for the channel fading. Fortunately, the equalization for OFDM is performed in the frequency domain using single-tap equalizers. The equalizer output samples, which are denoted as the decision variables, are then applied to a maximum likelihood detector (MLD) to regenerate the information symbols.
Unlike conventional OFDM detection, this work presents a new approach to regenerate the information symbols directly from the received samples at the FFT output. Thus, there is no need to perform channel estimation, interpolation, equalization, or detection operations. The proposed system exploits the fact that the channel coefficients over adjacent subcarriers are highly correlated and approximately equal, and hence, such information is used to estimate the transmitted data sequence. Consequently, the proposed detector is denoted as direct data detector ($D^3$).
The rest of this paper is organized as follows. The OFDM system and channel models are described in Section II. The proposed $D^3$ is presented in Section III, and an efficient implementation of the $D^3$ is explored in Section IV. The system error probability performance analysis is presented in Section V. Complexity analysis of the conventional pilot-based OFDM and the $D^3$ is given in Section VI. Numerical results are discussed in Section VII, and finally, the conclusion is drawn in Section VIII.
In what follows, unless otherwise specified, uppercase boldface and blackboard letters, such as $\mathbf{H}$ and $\mathbb{H}$, will denote $N\times N$ matrices, whereas lowercase boldface letters, such as $\mathbf{x}$, will denote row or column vectors with $N$ elements. Uppercase, lowercase, or bold letters with a tilde, such as $\tilde{d}$, will denote trial values, and symbols with a hat, such as $\hat{x}$, will denote the estimate of $x$. Letters with an apostrophe, such as $v'$, are used to denote the next value, i.e., $v' \triangleq v+1$. Furthermore, $E[\cdot]$ denotes the expectation operation.
II Signal and Channel Models
Consider an OFDM system with $N$ subcarriers modulated by a sequence of $N$ complex data symbols $\mathbf{d} = [d_0, d_1, \ldots, d_{N-1}]^T$. The data symbols are selected uniformly from a general constellation such as $M$-ary phase shift keying (MPSK), quadrature amplitude modulation (QAM) or $M$-ary amplitude shift keying (MASK). In conventional pilot-aided OFDM systems [24], $N_P$ of the subcarriers are allocated for pilot symbols, which can be used for channel estimation/synchronization purposes. The modulation process in OFDM can be implemented efficiently using an $N$-point inverse FFT (IFFT) algorithm, whose output during the $\ell$th OFDM block can be written as
$$\mathbf{x}(\ell) = \mathbf{F}^{H}\mathbf{d}(\ell) \tag{1}$$
where $\mathbf{F}$ is the normalized $N\times N$ FFT matrix, and hence, $\mathbf{F}^{H}$ is the IFFT matrix. The elements of $\mathbf{F}^{H}$ are defined as $F_{i,v} = (1/\sqrt{N})\,e^{j2\pi iv/N}$, where $i$ and $v$ denote the row and column indices, $\{i, v\} \in \{0, 1, \ldots, N-1\}$, respectively. In order to simplify the notation, the block index $\ell$ is dropped for the remaining parts of the paper unless it is necessary to include it. To combat ISI between consecutive OFDM symbols and maintain the subcarriers’ orthogonality in frequency-selective multipath fading channels, a CP of length $N_{\mathrm{CP}}$ samples, no less than the channel maximum delay spread ($\mathcal{D}_h$), is formed by copying the last $N_{\mathrm{CP}}$ samples of $\mathbf{x}$ and appending them in front of the IFFT output to compose the OFDM symbol with a total length of $N_t = N + N_{\mathrm{CP}}$ samples and a duration of $T_t$ seconds. Then, the complex baseband OFDM symbol during the $\ell$th signaling period is upsampled, filtered and upconverted to a radio frequency centered at $f_c$ before transmission through the antenna.
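As a minimal sketch of the modulation step in (1) and the CP insertion described above, the following Python fragment evaluates a normalized IFFT directly from its definition and prepends the last samples of the block. The function names, the BPSK data block, and the CP length of 2 are illustrative assumptions and are not values taken from this work.

```python
import cmath

def idft(symbols):
    """Normalized inverse DFT: x_n = (1/sqrt(N)) * sum_v d_v * exp(+j*2*pi*n*v/N)."""
    n_sc = len(symbols)
    scale = 1 / n_sc ** 0.5
    return [scale * sum(d * cmath.exp(2j * cmath.pi * n * v / n_sc)
                        for v, d in enumerate(symbols))
            for n in range(n_sc)]

def add_cyclic_prefix(block, cp_len):
    """Copy the last cp_len samples (cp_len >= 1) of the block to its front."""
    return block[-cp_len:] + block

# Hypothetical example: N = 8 BPSK symbols and a CP of 2 samples.
data = [1, -1, -1, 1, 1, 1, -1, 1]
ofdm_symbol = add_cyclic_prefix(idft(data), cp_len=2)
print(len(ofdm_symbol))  # total length N + CP length
```

Because the IFFT is normalized, the energy of the block before CP insertion equals the energy of the data symbols.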
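At the receiver front-end, the received signal is downconverted to baseband and sampled at a rate of $1/T_s$, where $T_s = T_t/N_t$. In this work, the channel is assumed to be composed of $L_h+1$ independent multipath components, each of which has a gain $h_m$ and delay $m\times T_s$, where $m \in \{0, 1, \ldots, L_h\}$. A quasi-static channel is assumed throughout this work, and thus, the channel taps are considered constant over one OFDM symbol, but they may change over two consecutive symbols. Therefore, the received sequence consists of $N_t$ samples, and can be expressed as
$$\mathbf{y}_{t} = \mathbb{H}_{t}\,\mathbf{x}_{t} + \mathbf{z}_{t} \tag{2}$$
where $\mathbf{x}_{t}$ is the CP-extended OFDM symbol, the channel matrix $\mathbb{H}_{t}$ is an $N_t\times N_t$ Toeplitz matrix with $h_0$ on the principal diagonal and $h_1, \ldots, h_{L_h}$ on the minor diagonals, respectively, and the elements of the noise vector $\mathbf{z}_{t}$ are modeled as complex additive white Gaussian noise (AWGN) random variables with zero mean and variance $\sigma_z^2$. The received non-CP samples that belong to a single OFDM symbol can be expressed as
$$y_n = \sum_{m=0}^{L_h} h_m\, x_{\langle n-m\rangle_N} + z_n, \quad n = 0, 1, \ldots, N-1 \tag{3}$$
where $\langle\cdot\rangle_N$ denotes the modulo-$N$ operation. Subsequently, the receiver discards the first $N_{\mathrm{CP}}$ samples, and computes the FFT of $\mathbf{y} = \mathbb{H}\mathbf{x} + \mathbf{z}$, where the channel matrix $\mathbb{H}$ is an $N\times N$ circulant matrix. Therefore, the FFT output can be computed as
$$\mathbf{r} = \mathbf{F}\mathbf{y} = \mathbf{F}\,\mathbb{H}\,\mathbf{F}^{H}\mathbf{d} + \mathbf{F}\mathbf{z} \tag{4}$$
Because the matrix $\mathbb{H}$ is circulant, it will be diagonalized by the FFT and IFFT matrices. Thus,
$$\mathbf{r} = \mathbf{H}\mathbf{d} + \mathbf{w} \tag{5}$$
where $\mathbf{r} = [r_0, r_1, \ldots, r_{N-1}]^T$, $\mathbf{w} = \mathbf{F}\mathbf{z}$ is the FFT of the noise vector $\mathbf{z}$, and $\mathbf{H} = \mathrm{diag}\{[H_0, H_1, \ldots, H_{N-1}]\}$ denotes the channel frequency response (CFR)
$$H_v = \sum_{m=0}^{L_h} h_m\, e^{-j2\pi mv/N} \tag{6}$$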
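The diagonalization that leads from the circulant time-domain channel to a per-subcarrier product can be checked numerically. The sketch below is a noiseless illustration under assumed values: the three channel taps and the BPSK block are hypothetical, and the normalized DFT pair is written out explicitly. It verifies that each FFT output sample equals the CFR at that subcarrier multiplied by the transmitted symbol.

```python
import cmath

def dft(x, inverse=False):
    """Normalized DFT (or inverse DFT) evaluated directly from the definition."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1 / n ** 0.5
    return [scale * sum(xk * cmath.exp(sign * 2j * cmath.pi * i * k / n)
                        for k, xk in enumerate(x))
            for i in range(n)]

def circular_channel(x, taps):
    """Effect of the channel on the non-CP samples: a circular convolution."""
    n = len(x)
    return [sum(h * x[(i - m) % n] for m, h in enumerate(taps)) for i in range(n)]

def channel_frequency_response(taps, n):
    """H_v = sum_m h_m * exp(-j*2*pi*m*v/N)."""
    return [sum(h * cmath.exp(-2j * cmath.pi * m * v / n) for m, h in enumerate(taps))
            for v in range(n)]

data = [1, -1, 1, 1, -1, -1, 1, -1]   # hypothetical BPSK block
taps = [0.8, 0.5 - 0.2j, 0.1j]        # hypothetical channel taps
x = dft(data, inverse=True)           # IFFT at the transmitter
r = dft(circular_channel(x, taps))    # FFT at the receiver (noise omitted)
H = channel_frequency_response(taps, len(data))
max_error = max(abs(rv - hv * dv) for rv, hv, dv in zip(r, H, data))
print(max_error)  # numerically negligible
```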
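By noting that $\mathbf{r}\,|\,\mathbf{H}, \mathbf{d} \sim \mathcal{CN}(\mathbf{H}\mathbf{d}, \sigma_w^2\mathbf{I}_N)$, where $\sigma_w^2$ is the noise variance and $\mathbf{I}_N$ is an $N\times N$ identity matrix, it is straightforward to show that the MLD can be expressed as
$$\hat{\mathbf{d}} = \arg\min_{\tilde{\mathbf{d}}} \left\| \mathbf{r} - \mathbf{H}\tilde{\mathbf{d}} \right\|^2 \tag{7}$$
where $\|\cdot\|$ denotes the Euclidean norm, and $\tilde{\mathbf{d}}$ denotes the trial values of $\mathbf{d}$. As can be noted from (7), the MLD requires the knowledge of $\mathbf{H}$. Moreover, because (7) describes the detection of more than one symbol, it is typically denoted as the maximum likelihood sequence detector (MLSD). If the elements of $\mathbf{d}$ are independent, the MLSD can be replaced by a symbol-by-symbol MLD
$$\hat{d}_v = \arg\min_{\tilde{d}_v} \left| r_v - H_v\tilde{d}_v \right|^2 \tag{8}$$
Since perfect knowledge of $\mathbf{H}$ is infeasible, an estimated version of $\mathbf{H}$, denoted as $\hat{\mathbf{H}}$, can be used in (7) and (8) instead of $\mathbf{H}$. Another possible approach to implement the detector is to equalize $\mathbf{r}$, and then use a symbol-by-symbol MLD. Given that a zero-forcing equalizer is used, the equalized received sequence can be expressed as
$$\check{\mathbf{r}} = \left[\hat{\mathbf{H}}^{H}\hat{\mathbf{H}}\right]^{-1}\hat{\mathbf{H}}^{H}\mathbf{r} \tag{9}$$
and
$$\hat{d}_v = \arg\min_{\tilde{d}_v} \left| \check{r}_v - \tilde{d}_v \right|^2 \tag{10}$$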
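The equalize-then-detect approach can be illustrated with a short Python sketch. Because the frequency-domain channel is diagonal, zero forcing reduces to one complex division per subcarrier, followed by a nearest-symbol decision. The CFR values, noise samples, and function names below are hypothetical, and the CFR is assumed to be known perfectly, which is an idealization.

```python
def zf_equalize(r, cfr_estimate):
    """Single-tap zero-forcing equalizer: divide each sample by its CFR estimate."""
    return [rv / hv for rv, hv in zip(r, cfr_estimate)]

def symbol_mld(equalized, constellation):
    """Symbol-by-symbol decision: pick the closest constellation point."""
    return [min(constellation, key=lambda s: abs(z - s)) for z in equalized]

# Hypothetical CFR, BPSK data, and small deterministic noise samples.
cfr = [0.9 + 0.3j, -0.4 + 0.8j, 0.2 - 1.1j, 1.0 - 0.1j]
data = [1, -1, -1, 1]
noise = [0.05 - 0.02j, -0.03 + 0.04j, 0.01 + 0.06j, -0.04 - 0.01j]
r = [h * d + w for h, d, w in zip(cfr, data, noise)]

detected = symbol_mld(zf_equalize(r, cfr), constellation=(1, -1))
print(detected)
```

In practice the CFR passed to the equalizer would be an estimate, so estimation errors propagate directly into the decision variables.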
It is interesting to note that solving (7) does not necessarily require the explicit knowledge of $\mathbf{H}$ under some special circumstances. For example, Wu and Kam [25] noticed that in flat fading channels, i.e., $H_v = h\ \forall v$, it is possible to detect the data symbols using the following multiple-symbol differential detector (MSDD),
$$\hat{\mathbf{d}} = \arg\max_{\tilde{\mathbf{d}}} \frac{\left|\tilde{\mathbf{d}}^{H}\mathbf{r}\right|^2}{\left\|\tilde{\mathbf{d}}\right\|^2} \tag{11}$$
Although the detector described in (11) is efficient in the sense that it does not require the knowledge of $h$, its bit error rate (BER) performance is very sensitive to the channel variations.
III Proposed System Model
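One of the distinctive features of OFDM is that its channel coefficients over adjacent subcarriers in the frequency domain are highly correlated and approximately equal. The correlation coefficient between two adjacent subcarriers can be defined as
$$\varrho_f \triangleq E\left[H_v H_{v'}^{*}\right] = \sum_{m=0}^{L_h} \sigma_{h_m}^2\, e^{j2\pi m/N} \tag{12}$$
where $\sigma_{h_m}^2 = E[|h_m|^2]$. The difference between two adjacent channel coefficients is
$$\Delta_f \triangleq H_v - H_{v'} = \sum_{m=0}^{L_h} h_m\, e^{-j2\pi mv/N}\left(1 - e^{-j2\pi m/N}\right) \tag{13}$$
For large values of $N$ and a channel normalized such that $\sum_m \sigma_{h_m}^2 = 1$, it is straightforward to show that $\varrho_f \rightarrow 1$ and $\Delta_f \rightarrow 0$. Similar to the frequency domain, the time domain correlation defined according to Jakes’ model can be computed as [26],
$$\varrho_t \triangleq E\left[H_v^{\ell}\left(H_v^{\ell'}\right)^{*}\right] = J_0\left(2\pi f_d T_t\right) \tag{14}$$
where $J_0(\cdot)$ is the Bessel function of the first kind and zero order, and $f_d$ is the maximum Doppler frequency. When $f_d T_t \ll 1$, $J_0(2\pi f_d T_t) \approx 1$, and thus $\varrho_t \approx 1$. Using the same argument, the difference in the time domain $\Delta_t \rightarrow 0$. Although the proposed system can be applied in the time domain, frequency domain, or both, the focus of this work is the frequency domain.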
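To illustrate how quickly adjacent subcarriers become correlated as the FFT size grows, the sketch below evaluates the adjacent-subcarrier correlation for independent, zero-mean channel taps. It assumes the closed form $\sum_m \sigma_{h_m}^2 e^{j2\pi m/N}$, which follows from the CFR definition when the tap powers $\sigma_{h_m}^2$ sum to one. The three-tap power profile is hypothetical.

```python
import cmath

def adjacent_subcarrier_correlation(tap_powers, n_sc):
    """E[H_v * conj(H_{v+1})] for independent zero-mean taps with the given powers."""
    return sum(p * cmath.exp(2j * cmath.pi * m / n_sc)
               for m, p in enumerate(tap_powers))

# Hypothetical normalized power delay profile (powers sum to one).
powers = [0.5, 0.3, 0.2]
for n_sc in (16, 64, 256, 1024):
    rho = adjacent_subcarrier_correlation(powers, n_sc)
    print(n_sc, round(abs(rho), 6))
```

The magnitude approaches one as the number of subcarriers increases relative to the channel length, which is the property the detector relies on.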
Based on the aforementioned properties of OFDM, a simple approach to extract the information symbols from the received sequence $\mathbf{r}$ can be designed by minimizing the difference of the channel coefficients between adjacent subcarriers, which can be expressed as
$$\hat{\mathbf{d}} = \arg\min_{\tilde{\mathbf{d}}} \sum_{v=0}^{N-2} \left| \frac{r_v}{\tilde{d}_v} - \frac{r_{v'}}{\tilde{d}_{v'}} \right|^2 \tag{15}$$
As can be noted from (15), the estimated data sequence $\hat{\mathbf{d}}$ can be obtained without the knowledge of $\mathbf{H}$. Moreover, there are no constraints on the channel coefficients, and hence, the $D^3$ should perform fairly well even in frequency-selective fading channels. Nevertheless, it can be noted that (15) does not have a unique solution because both $\tilde{\mathbf{d}}$ and $-\tilde{\mathbf{d}}$ can actually minimize (15). To resolve the phase ambiguity problem, one or more pilot symbols can be used as a part of the sequence $\mathbf{d}$. In such scenarios, the performance of the $D^3$ will be affected indirectly by the frequency selectivity of the channel because the capability of the pilot to resolve the phase ambiguity depends on its fading coefficient. Another advantage of using pilot symbols is that it will not be necessary to detect the $N$ symbols simultaneously. Instead, it will be sufficient to detect $\mathcal{K}$ symbols at a time, which can be exploited to simplify the system design and analysis.
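A brute-force version of this detector is easy to sketch for a short segment. The fragment below assumes the cost is the sum of $|r_v/\tilde{d}_v - r_{v+1}/\tilde{d}_{v+1}|^2$ over adjacent subcarriers, fixes the first symbol to a known pilot to remove the sign ambiguity, and searches over all remaining BPSK sequences. The slowly varying CFR values and the function names are hypothetical, and noise is omitted for clarity.

```python
from itertools import product

def d3_cost(r, trial):
    """Sum of squared differences between adjacent implied channel coefficients."""
    return sum(abs(r[v] / trial[v] - r[v + 1] / trial[v + 1]) ** 2
               for v in range(len(r) - 1))

def d3_brute_force(r, constellation, pilot):
    """Exhaustive search with the first symbol fixed to the pilot value."""
    best_tail = min(product(constellation, repeat=len(r) - 1),
                    key=lambda tail: d3_cost(r, (pilot,) + tail))
    return [pilot] + list(best_tail)

# Hypothetical slowly varying CFR over four adjacent subcarriers.
cfr = [1.0 + 0.5j, 0.95 + 0.55j, 0.9 + 0.6j, 0.85 + 0.62j]
data = [1, -1, -1, 1]                      # first symbol acts as the pilot
r = [h * d for h, d in zip(cfr, data)]

detected = d3_brute_force(r, constellation=(1, -1), pilot=1)
print(detected)
# Without a pilot, a sequence and its negation give the same cost.
print(d3_cost(r, data), d3_cost(r, [-d for d in data]))
```

The search grows exponentially with the segment length, so this form is only practical for very short segments.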
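Using the same approach of the frequency domain, the $D^3$ can be designed to work in the time domain as well by minimizing the difference of the channel coefficients over two consecutive subcarriers in time, i.e., two subcarriers with the same index over two consecutive OFDM symbols, which is also applicable to single-carrier systems. It can also be designed to work in both the time and frequency domains, where the detector can be described as
$$\hat{\mathbf{D}} = \arg\min_{\tilde{\mathbf{D}}} J\left(\tilde{\mathbf{D}}\right) \tag{16}$$
where $\mathbf{D}$ is an $\mathcal{L}\times\mathcal{K}$ data matrix, $\mathcal{L}$ and $\mathcal{K}$ are the time and frequency detection window sizes, and the objective function is given by
$$J\left(\tilde{\mathbf{D}}\right) = \sum_{\ell=0}^{\mathcal{L}-1}\sum_{v=0}^{\mathcal{K}-2} \left| \frac{r_v^{\ell}}{\tilde{d}_v^{\ell}} - \frac{r_{v'}^{\ell}}{\tilde{d}_{v'}^{\ell}} \right|^2 + \sum_{v=0}^{\mathcal{K}-1}\sum_{\ell=0}^{\mathcal{L}-2} \left| \frac{r_v^{\ell}}{\tilde{d}_v^{\ell}} - \frac{r_{v}^{\ell'}}{\tilde{d}_{v}^{\ell'}} \right|^2 \tag{17}$$
where $r_v^{\ell}$ denotes the FFT output at subcarrier $v$ of OFDM symbol $\ell$. For example, if the detection window size is chosen to be the LTE resource block, then $\mathcal{L}=7$ and $\mathcal{K}=12$. Moreover, the system presented in (17) can be extended to the multi-branch receiver scenario, i.e., single-input multiple-output (SIMO), as
$$\hat{\mathbf{D}} = \arg\min_{\tilde{\mathbf{D}}} \sum_{n=1}^{\mathcal{N}} J_n\left(\tilde{\mathbf{D}}\right) \tag{18}$$
where $J_n(\cdot)$ is the objective function in (17) evaluated using the samples of the $n$th receiving branch, and $\mathcal{N}$ is the number of receiving antennas.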
IV Efficient Implementation of the $D^3$
It can be noted from (16) and (17) that solving for $\hat{\mathbf{D}}$, given that $N_P$ pilot symbols are used, requires $M^{\mathcal{K}\mathcal{L}-N_P}$ trials if a brute-force search is adopted, which is prohibitively complex, and thus, reducing the computational complexity is crucial. Towards this goal, the two-dimensional (2-D) resource block can be divided into a number of one-dimensional (1-D) segments in the time and frequency domains. The main requirement is that each 1-D segment should contain at least one pilot symbol. Fig. 1 shows an example of a possible segmentation of a 2-D resource block into several 1-D segments over time and frequency.
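The saving offered by segmentation can be quantified with simple counting. The sketch below assumes that a brute-force search tries every combination of the unknown symbols, so a window with a given number of data symbols and modulation order $M$ needs $M$ raised to that number of trials, whereas independently searched 1-D segments need only the sum of their individual counts. The window size, the pilot count, and the assumption of exactly one pilot per segment are hypothetical choices for illustration.

```python
def brute_force_trials(order, window_symbols, pilots):
    """Number of trial sequences for a joint search over all unknown symbols."""
    return order ** (window_symbols - pilots)

def segmented_trials(order, segment_lengths):
    """Total trials when each 1-D segment (one pilot each) is searched separately."""
    return sum(order ** (length - 1) for length in segment_lengths)

# Hypothetical example: a 12 x 7 window with BPSK, split into 7 segments of 12
# symbols, each containing one pilot (7 pilots in total).
joint = brute_force_trials(order=2, window_symbols=12 * 7, pilots=7)
split = segmented_trials(order=2, segment_lengths=[12] * 7)
print(joint, split)
```

The counts only compare search sizes; they say nothing about whether the segmented search achieves the same error rate as the joint search.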
Furthermore, by noting that the expression in (15) corresponds to the sum of correlated terms, which can be modeled as a first-order Markov process, MLSD techniques such as the Viterbi algorithm can be used to implement the $D^3$ efficiently. For example, the trellis diagram of the Viterbi algorithm with binary phase shift keying (BPSK) is shown in Fig. 2, and can be implemented as follows:

1. Initialize the path metrics $\Gamma_0 = \Gamma_1 = 0$, where the subscripts $0$ and $1$ denote the upper and lower branches, respectively. Since BPSK is used, the number of states is $2$.

2. Initialize the counter, $c = 0$.

3. Compute the branch metric $J_{m,n}^{c} = \left| r_c/\tilde{d}_m - r_{c'}/\tilde{d}_n \right|^2$, where $m \in \{0, 1\}$ is the current symbol index, $\tilde{d}_0 = 1$ and $\tilde{d}_1 = -1$, and $n$ is the next symbol index using the same mapping as $m$.

4. Compute the path metrics using the following rule, $\Gamma_n \leftarrow \min_{m}\left[\Gamma_m + J_{m,n}^{c}\right]$ for $n \in \{0, 1\}$.

5. Track the surviving paths, $2$ paths in the case of BPSK.

6. Increase the counter, $c = c + 1$.

7. If $c = \mathcal{K} - 1$, the algorithm ends. Otherwise, go to step 3.
It is worth mentioning that placing a pilot symbol at the edge of a segment terminates the trellis. To simplify the discussion, assume that the pilot value is , and thus we compute only and . Consequently, long data sequences can be divided into smaller segments bounded by pilots, which can reduce the delay by performing the detection over the subsegments in parallel without sacrificing the error rate performance.
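The procedure can be sketched in Python for a BPSK segment that starts with a pilot. This is a minimal illustration and not the authors' implementation: it assumes the additive cost $\sum_v |r_v/\tilde{d}_v - r_{v+1}/\tilde{d}_{v+1}|^2$, uses the trial value of the current symbol as the trellis state, and takes a pilot value of $+1$. The CFR, the noise samples, and the function name are hypothetical.

```python
import cmath

def d3_viterbi(r, pilot=1, constellation=(1, -1)):
    """Minimize sum_v |r[v]/d[v] - r[v+1]/d[v+1]|^2 with d[0] fixed to the pilot."""
    # One survivor per state; a state is the trial value of the current symbol.
    metrics = {pilot: 0.0}
    paths = {pilot: [pilot]}
    for c in range(len(r) - 1):
        new_metrics, new_paths = {}, {}
        for nxt in constellation:
            # Add-compare-select over every state that can precede `nxt`.
            prev = min(metrics,
                       key=lambda s: metrics[s] + abs(r[c] / s - r[c + 1] / nxt) ** 2)
            new_metrics[nxt] = metrics[prev] + abs(r[c] / prev - r[c + 1] / nxt) ** 2
            new_paths[nxt] = paths[prev] + [nxt]
        metrics, paths = new_metrics, new_paths
    best_state = min(metrics, key=metrics.get)
    return paths[best_state]

# Hypothetical segment: slowly rotating CFR, BPSK data, small deterministic noise.
data = [1, -1, 1, 1, -1, -1]
cfr = [(1 + 0.4j) * cmath.exp(0.05j * v) for v in range(len(data))]
noise = [0.03 - 0.02j, -0.04 + 0.01j, 0.02 + 0.05j,
         -0.01 - 0.03j, 0.05 + 0.02j, -0.02 + 0.04j]
r = [h * d + w for h, d, w in zip(cfr, data, noise)]

detected = d3_viterbi(r)
print(detected)
```

Because the cost is a sum of terms that each depend on two neighbouring symbols only, the survivor selection gives the same sequence as an exhaustive search while evaluating a number of branch metrics that grows linearly with the segment length.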
V Error Rate Analysis of the $D^3$
The system BER analysis is presented for several cases according to the pilot/data arrangements and pilot power boosting. For simplicity, each case is discussed in a separate subsection. To make the analysis tractable, we consider BPSK modulation in the analysis, while the BER of higher-order modulations is obtained via Monte Carlo simulations.
V-A Single-Sided Pilot
To detect a data segment that contains symbols, at least one pilot symbol should be part of the segment in order to resolve the phase ambiguity problem. Consequently, the analysis in this subsection considers the case where there is only one pilot within the symbols, as shown in Fig. 3. Given that the FFT output vector is divided into segments each of which consists of symbols, including the pilot symbol, then the frequency domain detector of the can be written as,
(19) 
where denotes the index of the first subcarrier in the segment, and without loss of generality, we consider that . Therefore, by expanding (19) we obtain,
(20) 
which can be simplified to,
(21) 
For BPSK, , which is a constant term with respect to the maximization process in (21), and thus, it can be dropped. Therefore, the detector is reduced to
(22) 
Given that the pilot symbol is placed in the first subcarrier and noting that , then and can be written as
(23) 
The sequence error probability (SEP), conditioned on the channel frequency response over the symbols ( and the transmitted data sequence can be defined as,
(24) 
which can be also written in terms of the conditional probability of correct detection as,
(25) 
Without loss of generality, we assume that , ,… . Therefore,
(26) 
Since has data symbols, then there are trial sequences, , ,, , where , and , ,… . The first symbol in every sequence is set to , which is the pilot symbol. By defining , where , then (26) can be written as,
(27) 
which, as depicted in Appendix I, can be simplified to
(28) 
To evaluate given in (28), it is necessary to compute , which can be written as
(29) 
Given that , ,… , then and . Therefore, , and are independent conditionally Gaussian random variables with averages , , and , respectively, and the variance for all elements is . To derive the PDF of , the PDFs of and
should be evaluated, each of which corresponds to the product of two Gaussian random variables. Although the product of two Gaussian variables is not usually Gaussian, the limit of the moment-generating function of the product has a Gaussian distribution. Therefore, the product of two variables and tends to be as the ratios and increase [27]. By noting that in (29) , and and , thus . Moreover, because the PDF of the sum/difference of two Gaussian random variables is also Gaussian, then,
where
(30) 
and
(31) 
Consequently,
(32) 
and
(33) 
where . Since and are independent, then, the condition on in (33) can be removed by averaging over the PDF of and as,
(34) 
where the PDFs in (34) are multivariate Gaussian distributions that can be expressed as [28],
(35) 
where is the mean vector, which is defined as,
(36) 
and is the covariance matrix that is defined as,
(37) 
Due to the difficulty of evaluating integrals, we consider the special case of flat fading, which implies that and , where is the channel fading envelope, . Therefore, the SEP expression in (33) becomes,
(38) 
Recalling the Binomial Theorem, we get
(39) 
where,
(40) 
Then the SEP formula in (38) using the Binomial Theorem in (39) can be written as,
(41) 
The conditioning on can be removed by averaging over the PDF of , which is Rayleigh,
(42) 
And hence,
(43) 
Because the expression in (38) contains high orders of the function , evaluating the integral analytically becomes intractable for . For the special case of , can be evaluated by substituting (41) and (42) into (43); evaluating the integral yields the following simple expression,
(44) 
where is the average signal-to-noise ratio (SNR), . Moreover, because all data sequences have equal probability of error, then , which is also equivalent to the bit error rate (BER). It is interesting to note that (44) is similar to the BER of differential binary phase shift keying (DBPSK) [28]. However, the two techniques are essentially different, as the $D^3$ does not require differential encoding, has no constraints on the shape of the signal constellation, and performs well even in frequency-selective fading channels.
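The similarity to DBPSK can be examined with a small Monte Carlo experiment for the shortest segment, one pilot followed by one BPSK symbol, over flat Rayleigh fading. The sketch below is an illustration under stated assumptions: the pilot equals $+1$, the fading has unit average power, and the decision minimizes $|r_0 - r_1/\tilde{d}|^2$, which for BPSK reduces to the sign of $\mathrm{Re}\{r_0 r_1^{*}\}$. The reference value printed alongside is the well-known DBPSK error probability in slow Rayleigh fading, $1/(2(1+\bar{\gamma}))$, and is not a restatement of (44).

```python
import random

def simulate_sep(avg_snr, trials=20000, seed=1):
    """Monte Carlo SEP for a pilot plus one BPSK symbol in flat Rayleigh fading."""
    rng = random.Random(seed)
    fade_std = 0.5 ** 0.5                # per-dimension std so that E[|h|^2] = 1
    noise_std = (0.5 / avg_snr) ** 0.5   # per-dimension std, noise variance 1/avg_snr
    errors = 0
    for _ in range(trials):
        h = complex(rng.gauss(0, fade_std), rng.gauss(0, fade_std))
        d = rng.choice((1, -1))
        r0 = h + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))      # pilot = +1
        r1 = h * d + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))  # data
        # Minimizing |r0 - r1/d|^2 over d in {+1, -1} is a sign test on Re{r0 r1*}.
        d_hat = 1 if (r0 * r1.conjugate()).real >= 0 else -1
        errors += d_hat != d
    return errors / trials

avg_snr = 10.0                           # linear scale, i.e. 10 dB
print(simulate_sep(avg_snr), 1 / (2 * (1 + avg_snr)))
```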
To evaluate for , we use an approximation for in [29], which is given by
(45) 
Therefore, by substituting (45) into the conditional SEP (41) and averaging over the Rayleigh PDF (42), the evaluation of the SEP becomes straightforward. For example, evaluating the integral for gives,
(46) 
where , and is the exponential integral (EI), . Similarly, for can be evaluated to,
(47) 
where
Although the SEP is a very useful indicator of the system error probability performance, the BER is actually more informative. For a sequence that contains information bits, the BER can be expressed as , where denotes the average number of bit errors given a sequence error, which can be defined as
(48) 
Because the SEP is independent of the transmitted data sequence, then, without loss of generality, we assume that the transmitted data sequence is . Therefore,
(49) 
where in this case corresponds to the Hamming weight of the detected sequence , which can be expressed as
(50) 
where denotes the pairwise error probability (PEP). By noting that , then deriving the PEP for all cases of interest is intractable. As an alternative, a simple approximation is derived.
For a sequence that consists of information bits, the BER is bounded by
(51) 
In practical systems, the number of bits in the detected sequence is generally not too large, which implies that the upper and lower bounds in (51) are relatively tight, and hence, the BER can be approximated as the middle point between the two bounds as,
(52) 
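The bounding argument can be made concrete with a few lines of Python. The sketch assumes the usual reasoning behind such bounds: a sequence error contains at least one and at most all of the information bits in error, so the BER lies between the SEP divided by the number of information bits and the SEP itself, and the approximation takes the midpoint. The function names and the numerical values are hypothetical.

```python
def ber_bounds(sep, info_bits):
    """Lower and upper BER bounds implied by a given sequence error probability."""
    lower = sep / info_bits   # exactly one bit error per erroneous sequence
    upper = sep               # every bit wrong in each erroneous sequence
    return lower, upper

def ber_midpoint(sep, info_bits):
    """Approximate the BER by the midpoint between the two bounds."""
    lower, upper = ber_bounds(sep, info_bits)
    return 0.5 * (lower + upper)

# Hypothetical example: SEP of 1e-2 for a sequence carrying 4 information bits.
print(ber_bounds(0.01, 4), ber_midpoint(0.01, 4))
```

The gap between the bounds widens as the number of information bits grows, so the midpoint approximation is most useful for short segments.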
The analysis of the general SIMO system is a straightforward extension of the single-input single-output (SISO) case. To simplify the analysis, we consider the flat channel case where the conditional SEP can be written as,
(53) 
Given that all the receiving branches are independent, the fading envelopes will have Rayleigh distribution , and thus,
will have Gamma distribution,
,(54) 
Therefore, the unconditional SEP can be evaluated as,
(55) 
For the special case of , , can be evaluated as,
(56) 
where . Closed-form formulas for other values of and can be derived by following the same approach used in the SISO case.
V-B Double-Sided Pilot
Embedding more pilots in the detection segment can improve the detector’s performance. Consequently, it is worth investigating the effect of embedding more pilots in the SEP analysis. More specifically, we consider a double-sided segment, , , as illustrated in Fig. 4. In this case, the detector can be expressed as,
(57) 
From the definition in (57), the probability of receiving the correct sequence can be derived based on the reduced number of trials as compared to (23). Therefore,
(58) 
which, similar to the singlesided case, can be written as,
(59) 
Therefore,
(60) 
For flat fading channels, the SEP expression in (60) can be simplified by following the same procedure as in Subsection V-A. For the special case of , the SEP becomes,
(61) 
where