Burst-mode transmission is widely used in modern communication systems such as the Internet of Things (IoT) and wireless local area networks (WLAN). In a burst-mode communication system (BCS), frame synchronization (FS) underpins the overall system performance and is usually assumed to have been achieved at the receiver. However, a BCS contains many nonlinear devices or blocks, e.g., the high power amplifier (HPA) and the digital-to-analog converter (DAC), which inevitably cause nonlinear distortion. Because synchronization usually precedes channel estimation, signal demodulation, etc., it is the first stage to encounter this distortion, which degrades the receiver's FS performance (e.g., its error probability). Existing methods such as correlation-based FS do not account for nonlinear distortion and therefore face great challenges.
In recent years, machine learning has drawn considerable attention for its ability to cope with nonlinear distortion. Machine learning, and deep learning (DL) in particular, has been applied to wireless communication tasks such as signal detection, precoding, channel state information (CSI) feedback, and channel estimation. Yet very few works focus on DL-based FS. One related work investigated DL-based timing synchronization, but reported a higher timing error probability than conventional matched filtering. In addition, DL-based approaches suffer from difficulties such as complex parameter tuning and long training times.
Unlike DL-based approaches, the extreme learning machine (ELM) is a single-hidden-layer feed-forward neural network that does not require gradient back-propagation (BP). Its advantages include randomly generated input weights and hidden biases, fast learning (reported to be hundreds of times faster than the BP algorithm), and good generalization performance. Inspired by these advantages, this paper proposes an ELM-based FS to improve training sequence-based methods, e.g., correlation-based FS. Because nonlinear distortion destroys the orthogonality of the training sequence, training sequence-based FS is difficult to apply in such scenarios. In the proposed method, a preprocessing step first coarsely captures the features of the synchronization metric (SM) by using empirical knowledge. An ELM network is then employed to alleviate the system's nonlinear distortion and improve the SMs. Compared with correlation-based FS and a recent FS method, the proposed method effectively reduces the FS error probability under nonlinear distortion. Furthermore, the improvement remains stable as the system parameters change.
The remainder of this paper is structured as follows. Section II briefly describes the system model. Section III presents the ELM-based FS method, Section IV gives the numerical simulation and analysis, and Section V concludes our work.
Notations: Bold lowercase and uppercase letters denote vectors and matrices, respectively; italicized letters denote variables; $(\cdot)^{T}$, $(\cdot)^{H}$, $(\cdot)^{-1}$, and $(\cdot)^{\dagger}$ denote the transpose, conjugate transpose, matrix inversion, and Moore–Penrose pseudoinverse, respectively; $\mathbf{0}$ is the vector with zero elements; $\|\cdot\|_{F}$ is the Frobenius norm; $|x|$ denotes the absolute value of $x$, and $|\mathbf{x}|$ denotes the absolute value operation applied to every element of vector $\mathbf{x}$.
II System Model
Considering a frame-based BCS, the transmitted frame format is illustrated in Fig. 1(a), which consists of training symbols , empty symbols , and data symbols . To guarantee the training symbol and data symbol are allocated the same transmitted power, , , , is considered in this paper. The transmitted frame is formed by , where is the frame length. Similar to , empty symbols are employed to mitigate the multi-path channel dispersion. Fig. 1(b) presents the system model, in which the nonlinear distortion (due to the existence of nonlinear blocks or devices, such as HPA, DAC, etc ) is encountered by the frame , and then the distorted signals are transmitted. At the receiver, the observation of transmitted training sequence , denoted as , can be expressed as follows 
is the complex additive white Gaussian noise (AWGN) vector whose entries are with zero-mean and variance. The complex matrix , which consists of the distorted and shifted version of transmitted frame , could be defined as
where stands for the variant of the training symbol due to the nonlinear distortion, and is the size of search window. In equation (1), represents the extended vector of channel impulse response (CIR), which can be written as
where is the frame boundary offset to be estimated with . In (3), denotes the finite CIR vector of samples memory, where represents the complex-valued CIR of the th path.
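The channel part of this model can be sketched in a few lines of NumPy. The sketch below is only an illustration under stated assumptions: the exponential power profile `exp(-decay * l)`, the zeroing of later paths (taken from the simulation setup described later in the paper), and the layout of the extended CIR vector (the CIR placed after `offset` leading zeros inside a vector of the search-window length) are assumptions, and the names `sparse_rayleigh_cir`, `extended_cir`, `num_paths`, `decay`, and `window` are hypothetical.

```python
import numpy as np

def sparse_rayleigh_cir(num_paths, decay, rng, zero_prob=0.5):
    """Draw one complex CIR with an exponentially decaying power profile."""
    # Assumed profile: average power of path l is proportional to exp(-decay * l).
    power = np.exp(-decay * np.arange(num_paths))
    power /= power.sum()  # normalise the total average power to 1
    taps = np.sqrt(power / 2) * (rng.standard_normal(num_paths)
                                 + 1j * rng.standard_normal(num_paths))
    # Every path except the first is zeroed with probability zero_prob.
    keep = rng.random(num_paths) >= zero_prob
    keep[0] = True
    return taps * keep

def extended_cir(cir, offset, window):
    """Place the CIR after `offset` leading zeros in a length-`window` vector."""
    if offset < 0 or offset + len(cir) > window:
        raise ValueError("offset and CIR must fit inside the search window")
    ext = np.zeros(window, dtype=complex)
    ext[offset:offset + len(cir)] = cir
    return ext

rng = np.random.default_rng(0)
h = sparse_rayleigh_cir(num_paths=8, decay=0.2, rng=rng)
h_ext = extended_cir(h, offset=5, window=32)
print(np.flatnonzero(h_ext))  # indices of the non-zero taps
```

In this layout, estimating the frame boundary offset amounts to locating the first non-zero entry of the extended vector.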
III ELM Frame Synchronization
This section first describes a preprocessing step for FS (Section III-A) and then the ELM network (Section III-B). In prior work, the error probability of DL-based timing synchronization is far higher than that of matched filtering, and similar behavior is observed in our FS experiments when an ELM network is applied directly, with or without nonlinear distortion. A preprocessing step is therefore employed to capture the coarse features of the SM.
III-A Preprocessing of Frame Synchronization
where represents the elements of from to . In this paper, the existing methods for computing SMs, e.g., cross-correlation based method in , are viewed as empirical knowledge. It should be noted that besides the cross-correlation based SM, other SMs could also be applied in our method with the similar processing. Denoting the cross-correlation based SM as
then the SM vector can be given by
For easy operation of ELM network, we normalize in (6) as
The normalized SM vector is then used as the input of the ELM network. The ELM network is employed for FS to mitigate the nonlinear distortion and improve the SMs, as elaborated in the following subsection.
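As a rough illustration of the preprocessing, the following sketch computes a cross-correlation based SM for every candidate offset in the search window and normalizes the resulting vector. Several details are assumptions rather than the paper's exact definitions: the metric is taken as the correlation magnitude, the normalization divides by the Euclidean norm of the SM vector, and the names `zadoff_chu`, `sync_metric`, and `window` are hypothetical. A Zadoff-Chu training sequence of odd length is used, as in the simulations.

```python
import numpy as np

def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence of odd `length` with root index `root`."""
    n = np.arange(length)
    return np.exp(-1j * np.pi * root * n * (n + 1) / length)

def sync_metric(received, training, window):
    """Correlation magnitude per candidate offset, then normalised."""
    n_s = len(training)
    metric = np.array([
        abs(np.vdot(training, received[t:t + n_s]))  # vdot conjugates its first argument
        for t in range(window)
    ])
    norm = np.linalg.norm(metric)
    return metric / norm if norm > 0 else metric

rng = np.random.default_rng(1)
s = zadoff_chu(31)
window, true_offset = 16, 6
size = len(s) + window - 1
# Noise-only buffer with the training sequence embedded at the true offset.
y = 0.05 * (rng.standard_normal(size) + 1j * rng.standard_normal(size))
y[true_offset:true_offset + len(s)] += s
m = sync_metric(y, s, window)
print(int(np.argmax(m)), float(np.linalg.norm(m)))
```

Without nonlinear distortion or multi-path, the peak of this vector already marks the offset; the ELM stage is meant for the cases in which it no longer does.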
III-B ELM-based Frame Synchronization
The ELM-based FS includes offline and online procedures, which are elaborated in Table I and Table II, respectively.
|Given a training set , hidden neuron number and real-valued activation function, the training steps are summarized as follows:|
For offline training, samples, i.e., , , are collected to form a training set, where is the offset label of the th sample. Denoting the offset of the th sample as , the label can be encoded according to one-hot mode, i.e.,
As shown in Table I, during the offline training procedure, the input weight and hidden bias of ELM network are randomly chosen, where is the hidden neuron number. Then, the hidden layer output can be given by
where the activation function can be chosen from common real-valued options (e.g., those in [16]). By collecting , a training output matrix can be constructed as
From the training set , the training labels can be used to form a label matrix , i.e.,
According to and , the output weight can be given by
The main task of offline training is to learn the output weight . With the learned output weight , the chosen input weight and hidden bias , the ELM network can run online, as described in Table II.
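The offline procedure can be sketched as follows. This is a minimal illustration of standard ELM training, not a reproduction of Table I: the sigmoid activation, the uniform range of the random weights, the toy training data, and the names `one_hot`, `elm_train`, and `elm_forward` are all assumptions. The only learned quantity is the output weight, obtained with the Moore–Penrose pseudoinverse of the hidden-layer output matrix.

```python
import numpy as np

def one_hot(offsets, num_classes):
    """Encode integer offsets as one-hot label rows."""
    labels = np.zeros((len(offsets), num_classes))
    labels[np.arange(len(offsets)), offsets] = 1.0
    return labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(inputs, labels, num_hidden, rng):
    """inputs: (num_samples, input_dim); labels: (num_samples, num_classes)."""
    input_dim = inputs.shape[1]
    # Input weight and hidden bias are drawn once at random and never updated.
    weights = rng.uniform(-1.0, 1.0, size=(input_dim, num_hidden))
    bias = rng.uniform(-1.0, 1.0, size=num_hidden)
    hidden = sigmoid(inputs @ weights + bias)
    beta = np.linalg.pinv(hidden) @ labels  # least-squares output weight
    return weights, bias, beta

def elm_forward(inputs, weights, bias, beta):
    return sigmoid(inputs @ weights + bias) @ beta

# Toy training set (assumption): SM-like vectors with a peak at the true offset.
rng = np.random.default_rng(0)
window, num_samples = 8, 400
offsets = rng.integers(window, size=num_samples)
x = 0.1 * np.abs(rng.standard_normal((num_samples, window)))
x[np.arange(num_samples), offsets] += 1.0
w, b, beta = elm_train(x, one_hot(offsets, window), num_hidden=64, rng=rng)
accuracy = np.mean(np.argmax(elm_forward(x, w, b, beta), axis=1) == offsets)
print(beta.shape, accuracy)
```

Because no gradient iterations are involved, training reduces to one matrix pseudoinverse, which is where the speed advantage over BP-trained networks comes from.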
|With the learned output weight , the chosen input weight and hidden bias , the online running steps of ELM network are summarized as follows:|
For online running, the input of ELM-based FS network (i.e., the metric vector ) is obtained by employing the preprocessing, i.e., the equations from (5) to (7). Then, is fed into the trained ELM-based FS network, which produces a network output as
By expressing as , the estimation of frame boundary offset can be given by
In summary, the ELM-based FS network improves the SMs and thereby mitigates multi-path interference and nonlinear distortion.
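The online step is a single forward pass followed by a peak search. The sketch below assumes a sigmoid activation and takes the offset estimate as the index of the largest squared output magnitude; both choices, the function name `elm_estimate_offset`, and the hand-picked parameter values (stand-ins for a trained network, chosen only so that the example is easy to follow) are assumptions.

```python
import numpy as np

def elm_estimate_offset(metric, weights, bias, beta):
    """Forward pass of a trained ELM, then pick the strongest output entry."""
    hidden = 1.0 / (1.0 + np.exp(-(metric @ weights + bias)))
    output = hidden @ beta
    return int(np.argmax(np.abs(output) ** 2)), output

# Hypothetical stand-in parameters; a real network would use the randomly
# chosen input weight and bias together with the learned output weight.
window = 8
weights = 4.0 * np.eye(window)
bias = -2.0 * np.ones(window)
beta = np.eye(window)

metric = np.full(window, 0.1)
metric[3] = 0.9  # normalised SM vector with its peak at offset 3
offset_hat, output = elm_estimate_offset(metric, weights, bias, beta)
print(offset_hat)
```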
IV Numerical Simulation
To verify that the proposed ELM-based FS improves the error probability performance, we compare it with the classical correlation-based FS and a recent method when nonlinear distortion is present. We also validate the robustness and generalization of this improvement.
The basic parameters involved are listed below. The training sequence is a Zadoff-Chu sequence , , , , , , and . The decibel (dB) form of the signal-to-noise ratio (SNR) and the error probability of FS are defined as  and , respectively. A multi-path Rayleigh fading channel with an exponentially decaying power coefficient (denoted as ) of 0.2 is considered. For a fair comparison with , the same setting is used, i.e., every path except the first is set to zero with a probability of 0.5. Note that the proposed ELM-based FS is applicable regardless of the sparsity of the channel. For nonlinear distortion, we consider the effects of the HPA in this paper. The nonlinear amplitude and phase are respectively adopted from 
According to , , , , and are considered in the simulations.
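For readers who want to reproduce the HPA nonlinearity, the two-parameter Saleh model from the cited TWT amplifier reference maps an input amplitude r to an output amplitude alpha_a * r / (1 + beta_a * r^2) and a phase shift alpha_p * r^2 / (1 + beta_p * r^2). The sketch below implements that form. The default parameter values are a commonly quoted Saleh parameter set used purely for illustration, with the phase interpreted in radians; they are an assumption and are not claimed to be the values used for HPA1 or HPA2 in this paper.

```python
import numpy as np

def saleh_hpa(x, alpha_a=2.1587, beta_a=1.1517, alpha_p=4.0033, beta_p=9.1040):
    """Apply Saleh AM/AM and AM/PM distortion to complex baseband samples."""
    r = np.abs(x)
    amplitude = alpha_a * r / (1.0 + beta_a * r ** 2)           # AM/AM conversion
    phase_shift = alpha_p * r ** 2 / (1.0 + beta_p * r ** 2)    # AM/PM conversion (radians, assumed)
    return amplitude * np.exp(1j * (np.angle(x) + phase_shift))

x = np.array([0.0 + 0.0j, 0.5 + 0.0j, 1.0 + 0.0j])
y = saleh_hpa(x)
print(np.abs(y), np.angle(y))
```

Even a constant-modulus training sequence is rotated and scaled by this block, and any amplitude variation introduced before the amplifier is distorted sample by sample, which is why the correlation peak of a training sequence-based method degrades.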
For simplicity, we use “Prop”, “Corr” and “Ref_” to denote the proposed ELM-based FS, the correlation-based FS in , and the “CL-OMP” FS method in , respectively. In addition, “FS_Learn” denotes the FS method in which an ELM learns FS directly from the received observation in (1), i.e., without the preprocessing procedure given in III-A.
IV-A Error Probability Performance of FS
The effectiveness of the proposed ELM-based FS is validated by the error probability curves in Fig. 2. The error probabilities of “Corr” and “Ref_” are much higher than that of “Prop” in the relatively high SNR regime, e.g., dB. Meanwhile, the error probability of “FS_Learn” is higher than those of “Corr”, “Ref_” and “Prop”. That is, without the preprocessing procedure given in III-A, the input of the ELM network is the received observation in (1) rather than in (7), and the network cannot work well. This reflects the importance of preprocessing in ELM-based FS. In addition, the recent FS in  is almost not applicable due to its poor error probability even at a relatively high SNR (e.g., at dB), while the proposed ELM-based FS achieves a relatively low error probability and thus remains feasible for practical applications at a relatively high SNR (e.g., at dB). On the whole, the proposed ELM-based FS reduces the error probability compared with “Corr” and “Ref_”.
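Error probability curves of this kind are typically estimated by Monte Carlo simulation. The sketch below does this for a plain correlation-based estimator only, over an AWGN channel without HPA or multi-path, so it is a simplified baseline and not the setup behind Fig. 2. It assumes that an FS error is counted whenever the estimated offset differs from the true one; the sequence length, window size, trial count, and all function names are hypothetical.

```python
import numpy as np

def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence of odd `length` with root index `root`."""
    n = np.arange(length)
    return np.exp(-1j * np.pi * root * n * (n + 1) / length)

def corr_estimate(received, training, window):
    """Offset with the largest cross-correlation magnitude."""
    n_s = len(training)
    metric = [abs(np.vdot(training, received[t:t + n_s])) for t in range(window)]
    return int(np.argmax(metric))

def error_probability(snr_db, trials, rng, n_s=31, window=16):
    """Fraction of trials in which the estimated offset is wrong."""
    training = zadoff_chu(n_s)
    # Unit signal power, so the noise variance is 10^(-SNR/10), split over I and Q.
    noise_std = np.sqrt(10 ** (-snr_db / 10) / 2)
    size = n_s + window - 1
    errors = 0
    for _ in range(trials):
        offset = int(rng.integers(window))
        received = noise_std * (rng.standard_normal(size)
                                + 1j * rng.standard_normal(size))
        received[offset:offset + n_s] += training
        errors += corr_estimate(received, training, window) != offset
    return errors / trials

rng = np.random.default_rng(2)
for snr_db in (-20, 0, 20):
    print(snr_db, error_probability(snr_db, trials=200, rng=rng))
```

Swapping `corr_estimate` for another estimator, and inserting distortion and channel blocks before the noise is added, gives the same harness for comparing methods under identical conditions.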
IV-B Robustness Analysis
Usually, the error probability of synchronization is influenced by the number of multi-paths (i.e., ), the length of the training sequence (i.e., ), the length of the transmitted frame (i.e., ), and the HPA (i.e., different degrees of nonlinear distortion). To illustrate the robustness of the improvement under nonlinear distortion, Fig. 3(a), Fig. 3(b), Fig. 3(c) and Fig. 3(d) show the impacts of , , , and different HPAs, respectively. Apart from the parameter under study (i.e., only , , and the parameters of the HPA are changed for Fig. 3(a), Fig. 3(b), Fig. 3(c) and Fig. 3(d), respectively), the basic parameters remain the same as in Fig. 2.
IV-B1 Robustness against the number of multi-paths
To demonstrate the impact of on robustness, Fig. 3(a) shows the error probability of FS, where , and are considered. Fig. 3(a) shows that the reduction in error probability is more significant with a smaller . With the increase of , the error probabilities increase in all cases (i.e., “Corr”, “Ref_” and “Prop”) due to the stronger multi-path interference. Even so, the error probability of “Prop” is much lower than those of “Corr” and “Ref_”, especially for dB. Consequently, compared with “Corr” and “Ref_”, the proposed ELM-based FS exhibits a robust reduction in error probability against varying .
IV-B2 Robustness against the training sequence length
Fig. 3(b) plots the error probability of FS for different (i.e., , and ). As increases, a lower error probability of FS is obtained for “Corr”, “Ref_” and “Prop”, because a longer training sequence is more effective at overcoming multi-path interference in the given scenario. The error probability of “Prop” is lower than those of “Corr” and “Ref_”, especially in the high SNR regime (e.g., dB), and changing has little impact on this improvement.
IV-B3 Robustness against the frame length
To validate the effectiveness against the impact of , the error probability curves are illustrated in Fig. 3(c), where , and are considered, respectively. As decreases, the error probabilities of “Corr”, “Ref_” and “Prop” slightly decrease because there are fewer candidate locations for the index search (since ). From Fig. 3(c), the error probability of “Prop” is lower than those of “Corr” and “Ref_” for all considered values of . This shows that the error probability is reduced and that the improvement is robust against varying .
IV-B4 Robustness against different HPAs
Besides the HPA mentioned above (denoted as HPA1), an additional HPA (denoted as HPA2), whose parameters are set as , , , and , is also employed in Fig. 3(d) to observe the influence of the HPA on “Prop”. From , the root mean-square (RMS) errors of the nonlinear amplitude and phase of HPA1 (HPA2) are 0.012 (0.041) and 0.478 (0.508), respectively. According to the RMS errors, HPA1 introduces less distortion than HPA2 and thus yields a lower FS error probability for “Prop”, “Corr” and “Ref_”. In particular, for both HPA1 and HPA2, the error probability of “Prop” is clearly lower than those of “Corr” and “Ref_”. Therefore, the proposed ELM-based FS works well with both HPA1 and HPA2.
IV-C Generalization Analysis
IV-C1 Generalization against
In Fig. 4(a), the trained networks of and are respectively employed to test the cases where , , and . From Fig. 4(a), the error probability performance degrades when the testing differs from the training . Even so, the error probability of “Prop” is clearly lower than those of “Corr” and “Ref_”. Therefore, even when the testing differs from the training , “Prop” still achieves a lower error probability than “Corr” and “Ref_”.
IV-C2 Generalization against the exponentially decaying power coefficient
The error probability performance for the case where the testing differs from the training is plotted in Fig. 4(b). In Fig. 4(b), the training is 0.2, while the testing is 0.3. According to the error probability of FS, this mismatch has no obvious influence on “Prop”. Besides, the error probability of “Prop” is clearly lower than those of “Corr” and “Ref_”. Thus, “Prop” possesses good generalization performance against .
V Conclusion
In this work, we investigated ELM-based FS to improve the performance of burst-mode communication systems. A preprocessing step first captures the coarse features of the SM, followed by an ELM network that reduces the system's nonlinear distortion and recovers the SMs. Compared with existing methods, the proposed ELM-based FS reduces the error probability, and this improvement is shown to be robust and to generalize. In this paper, the difficulty of obtaining the desired labels is sidestepped by generating them according to an existing channel model. In future work, we will consider the desired FS labels in real channel scenarios to promote the application of machine learning-based FS in practical systems (such as IoT and WLAN) with nonlinear distortion.
-  Y. Kuo, C. Li, J. Jhang and S. Lin, “Design of a wireless sensor network-based IoT platform for wide area and heterogeneous applications,” IEEE Sensors J., vol. 18, no. 12, pp. 5187–5197, Jun. 2018.
-  H. Zhang, Y. Hou, Y. Chen and S. Li, “Analysis of simplified frame synchronization scheme for burst-mode multi-carrier system,” IEEE Commun. Lett., vol. 23, no. 6, pp. 1054–1056, Jun. 2019.
-  J. Guerreiro, R. Dinis and P. Montezuma, “Analytical performance evaluation of precoding techniques for nonlinear massive MIMO systems with channel estimation errors,” IEEE Trans. Commun., vol. 66, no. 4, pp. 1440–1451, Apr. 2017.
-  J. Sun, W. Shi, Z. Yang, J. Yang and G. Gui, “Behavioral modeling and linearization of wideband RF power amplifiers using BiLSTM networks for 5G wireless systems,” IEEE Trans. Veh. Technol., vol. 68, no. 11, pp. 10348–10356, Nov. 2019.
-  F. Ling and J. Proakis, Synchronization in Digital Communication Systems. Cambridge, U.K.: Cambridge Univ. Press, 2017.
-  J. Liu, K. Mei, X. Zhang, D. Ma and J. Wei, “Online extreme learning machine-based channel estimation and equalization for OFDM systems,” IEEE Commun. Lett., vol. 23, no. 7, pp. 1276–1279, Jul. 2019.
-  H. Ye, Y. Li and B. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
-  H. Huang, Y. Song, J. Yang, G. Gui, and F. Adachi, “Deep-learning-based millimeter-wave massive MIMO for hybrid precoding,” IEEE Trans. Veh. Technol., vol. 68, no. 3, pp. 3027–3032, Mar. 2019.
-  C. Qing, B. Cai, Q. Yang, J. Wang and C. Huang, “Deep learning for CSI feedback based on superimposed coding,” IEEE Access, vol. 7, pp. 93723–93733, Jul. 2019.
-  H. He, C. Wen, S. Jin and G. Li, “Deep learning-based channel estimation for beamspace mmWave massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 852–855, Oct. 2018.
-  H. Huang, J. Yang, H. Huang, Y. Song and G. Gui, “Deep learning for super-resolution channel estimation and DOA estimation based massive MIMO system,” IEEE Trans. Veh. Technol., vol. 67, no. 9, pp. 8549–8560, Sept. 2018.
-  T. O’Shea, K. Karra and T. Clancy, “Learning approximate neural estimators for wireless channel state information,” in Proc. IEEE 27th Int. Workshop Machine Learn. Signal Process., Tokyo, Japan, Sept. 2017, pp. 1–7.
-  G. Huang, Q. Zhu and C. Siew, “Extreme learning machine: theory and applications,” Neurocomputing, vol. 70, no. 1–3, pp. 489–501, Dec. 2006.
-  G. Huang, Q. Zhu and C. Siew, “Extreme learning machine: a new learning scheme of feedforward neural networks,” in Proc. Int. Joint Conf. Neural Netw., Budapest, Hungary, Jul. 2004, vol. 2, pp. 985–990.
-  B. Lopes, S. Catarino, N. Souto, R. Dinis and F. Cercas, “Robust joint synchronization and channel estimation approach for frequency-selective environments,” IEEE Access, vol. 6, pp. 53180–53190, Sept. 2018.
-  T. Luong, Y. Ko, A. N. Vien, D. Nguyen and M. Matthaiou, “Deep learning-based detector for OFDM-IM,” IEEE Wireless Commun. Lett., vol. 8, no. 4, pp. 1159–1162, Aug. 2019.
-  D. Chu, “Polyphase codes with good periodic correlation properties,” IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 531–532, Jul. 1972.
-  X. Yao, Y. Liu and G. Lin, “Evolutionary programming made faster,” IEEE Trans. Evol. Comput., vol. 3, no. 2, pp. 82–102, Jul. 1999.
-  A. Saleh, “Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers,” IEEE Trans. Commun., vol. 29, no. 11, pp. 1715–1720, Nov. 1981.