I Introduction
Conventional methods optimize the modules of a communication system, such as the encoder and the modulator, separately to achieve better transmission quality [1][2]. Deep learning has developed rapidly over the past decade and shows great potential in wireless communication. There are many model-driven applications based on deep learning [3], such as massive MIMO [4] and OFDM [5].
One important application of deep learning is to view the communication system as an end-to-end autoencoder in which the modules are optimized jointly. The results in [6] show that autoencoders can readily match the performance of near-optimal existing baseline modulation and coding schemes by learning the system during training. The transmitter maps a one-hot vector to particular constellation symbols for transmission, and the signals distorted by the channel are used to reconstruct the original vector. The authors of [7] point out that an end-to-end structure needs a differentiable channel model to optimize the transceiver. However, the one-hot transmission scheme is limited because all information bits are used to transmit only one symbol, which seriously decreases transmission efficiency.
In contrast to the one-hot scheme, the block scheme is a transmission scheme that allows parallel inputs. It enables a communication system to transmit a stream of information bits instead of the bits for a single symbol [8]. In [9], the block scheme [10] is introduced into the autoencoder to handle the transmission of batches of sequences. This structure supports binary input sequences of arbitrary length, but its performance is not good enough for practical use.
In this paper, we build an end-to-end autoencoder with a block transmission scheme. To improve its performance, we also introduce a memory mechanism into the neural networks. Our contributions are as follows:

We propose a novel autoencoder structure based on neural networks. It introduces the block scheme to process sequences in the form of blocks and allows arbitrary input length, which improves transmission efficiency. With the memory mechanism of recurrent neural networks (RNNs), the autoencoder exploits potential relationships between blocks for modulation. By optimizing the transmitter and receiver jointly, the constellation diagram is learned automatically for a particular modulation mode.

We train and test the model under different channel models. The proposed model performs better than other autoencoder-based communication systems [9] under typical channels. The simulation results also show that a lower code rate leads to a lower bit error rate (BER).
II Deep Neural Network Structures
A deep feedforward network, also called a multilayer perceptron, is a typical deep learning model [11]. A feedforward network defines a mapping $y = f(x; \theta)$ and uses backpropagation [12] to learn the value of the parameters $\theta$ that gives the best nonlinear approximation of the function we need. There is no feedback connection between the output and the model itself; when such connections exist, the network is called a recurrent neural network (RNN).
Given a set of training samples, we feed them into the network in batches. The output is used to calculate the loss and compute the gradient. The gradient is propagated back through the network, and the parameter vector is updated according to the gradient.
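The batch update described above can be illustrated with a deliberately tiny example. The sketch below assumes a one-parameter model $y = \theta x$, a mean squared-error loss, and a hand-picked learning rate; the data and all names are hypothetical and are not taken from the system in this paper.

```python
def sgd_step(theta, batch, lr):
    """One gradient-descent step for the toy model y = theta * x."""
    # Gradient of the mean squared error mean((theta*x - y)^2) w.r.t. theta.
    grad = sum(2.0 * (theta * x - y) * x for x, y in batch) / len(batch)
    # Move the parameter against the gradient.
    return theta - lr * grad


# Hypothetical toy batch generated by y = 3x (illustration only).
batch = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

theta = 0.0
for _ in range(200):
    theta = sgd_step(theta, batch, lr=0.05)
```

In a deep network the same loop applies, with the gradient of every layer obtained by backpropagation instead of by a closed-form expression.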
There are several typical kinds of layers of neural networks.

Fully-connected layer. The neural units of two adjacent layers are fully connected. Each neural unit applies an activation function that introduces nonlinearity into the network, which gives the fully-connected layer a strong ability to approximate the target function.
A convolutional neural network (CNN) consists of a series of filters called kernels. The kernels define receptive fields and extract features from inputs such as images. Convolutional networks have been applied in some novel communication structures. In [6], CNNs accomplish classification tasks for different modulation schemes.

A long short-term memory (LSTM) network is an RNN architecture. It introduces a memory mechanism and extracts relationships between time steps. LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units [13]. The LSTM architecture we adopt in our system is shown in Fig. 1. Cells are connected recurrently to each other, replacing the usual hidden units of ordinary recurrent neural networks. An input feature is computed with a regular artificial neuron unit. Its value can be accumulated into the state if the sigmoidal input gate allows it. The state unit has a linear self-loop whose weight is controlled by the forget gate. The output of the cell can be shut off by the output gate. All the gating units have a sigmoidal nonlinearity, while the input unit can have any squashing nonlinearity [11].
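For reference, the gating behaviour described above can be written compactly in vector form. The symbols below are chosen here for illustration: $x^{(t)}$ is the input, $h^{(t)}$ the cell output, $s^{(t)}$ the state, $U$, $W$ and $b$ the input weights, recurrent weights and biases, $\sigma$ the logistic sigmoid and $\odot$ the element-wise product. The use of $\tanh$ as the squashing nonlinearity is an assumption, since the text allows any such function.

```latex
\begin{aligned}
g^{(t)} &= \sigma\left(b^{g} + U^{g} x^{(t)} + W^{g} h^{(t-1)}\right) && \text{(input gate)} \\
f^{(t)} &= \sigma\left(b^{f} + U^{f} x^{(t)} + W^{f} h^{(t-1)}\right) && \text{(forget gate)} \\
q^{(t)} &= \sigma\left(b^{o} + U^{o} x^{(t)} + W^{o} h^{(t-1)}\right) && \text{(output gate)} \\
s^{(t)} &= f^{(t)} \odot s^{(t-1)} + g^{(t)} \odot \tanh\left(b + U x^{(t)} + W h^{(t-1)}\right) && \text{(state with linear self-loop)} \\
h^{(t)} &= \tanh\left(s^{(t)}\right) \odot q^{(t)} && \text{(cell output)}
\end{aligned}
```

The forget gate $f^{(t)}$ is the weight of the linear self-loop on the state, and setting $q^{(t)}$ close to zero shuts off the cell output.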
III System Model
We build an end-to-end communication system from neural networks fed with block data, which enables joint optimization of the transceiver.
III-A Network Structure
The structure of the block autoencoder is shown in Fig. 2. It consists of the following parts.

The input is a stream of bits. To support block transmission, we divide the stream into a fixed number of blocks, each containing the same number of bits to be modulated, so the total input length is the number of blocks multiplied by the block size.

In the first layer, we adopt a convolutional neural network to compress the input bits into blocks. Its output is sent to several LSTM layers that produce the modulated complex symbols. We combine time-distributed layers with the LSTM layers to introduce some linear relationship between symbols. To satisfy the power constraint, we normalize the output symbols at the end of the transmitter. The detailed parameters of our autoencoder are shown in Table I.

Since we build the encoding operation into the network by adjusting the output dimension of the time-distributed layers, the number of complex symbols is not fixed by the input alone but depends on the code rate we set.
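Table I: Layer parameters of the autoencoder

Layer             Parameters
----------------  ---------------------------------------------
Input
Conv1D_1          stride=S, kernel=S, filters=128
LSTM_1            units=400
Time Distributed
LSTM_2            units=128
LSTM_3            units=64
Conv1D_2          stride=1 (default), kernel=S, filters=64
Time Distributed
Reshape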
Following the encoding and modulation operations, the coded sequence is transmitted over the communication channel as the I and Q components of a digital signal. In our model, the communication channel is a non-trainable part of the network.
The distorted signal is demodulated and decoded by the receiver, whose layers reconstruct the input sequence. Each trainable layer of the proposed autoencoder is followed by a batch normalization layer so that training converges more quickly.
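Two fixed operations of the transmitter described in this subsection, grouping the input bits into blocks and normalizing the output symbols, can be sketched without any learned weights. The sketch below only mimics the grouping performed by a stride-S, kernel-S convolution and is not the learned layer itself. It assumes the power constraint means unit average power over the sequence. The function names and sample values are hypothetical.

```python
import math


def split_into_blocks(bits, block_size):
    """Group a flat bit sequence into consecutive blocks of block_size bits."""
    if len(bits) % block_size != 0:
        raise ValueError("input length must be a multiple of the block size")
    return [bits[i:i + block_size] for i in range(0, len(bits), block_size)]


def normalize_average_power(symbols):
    """Scale complex symbols so that their average power equals 1 (assumed constraint)."""
    avg_power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    if avg_power == 0.0:
        raise ValueError("cannot normalize an all-zero symbol sequence")
    scale = 1.0 / math.sqrt(avg_power)
    return [s * scale for s in symbols]


# Hypothetical example: 12 bits in blocks of 6, and three arbitrary symbols.
bits = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1]
blocks = split_into_blocks(bits, block_size=6)
symbols = normalize_average_power([3 + 4j, 1 - 1j, -2 + 0.5j])
```

Normalizing over the whole sequence constrains the average power only, so individual symbols may still have different magnitudes.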
III-B Channel Model

We first consider the AWGN channel, which is used to train and test our autoencoder. We add zero-mean complex Gaussian noise to the transmitted symbols. The noise variance is calculated from the given signal-to-noise ratio and the block size.
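A minimal sketch of such a noise layer is given below. It rests on several assumptions that the text does not state: the signal-to-noise ratio is given as $E_b/N_0$ in dB, the transmitted symbols have unit average energy, and each symbol carries `bits_per_symbol * code_rate` information bits, with `bits_per_symbol` equal to the block size. The function names are hypothetical.

```python
import math
import random


def noise_std_per_dimension(ebn0_db, bits_per_symbol, code_rate=1.0):
    """Standard deviation of the noise on each of the I and Q components."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    # Assumption: unit symbol energy, so Es/N0 = bits_per_symbol * code_rate * Eb/N0.
    n0 = 1.0 / (bits_per_symbol * code_rate * ebn0)
    # The complex noise variance N0 is split evenly between I and Q.
    return math.sqrt(n0 / 2.0)


def add_awgn(symbols, sigma, rng):
    """Add zero-mean complex Gaussian noise to every symbol."""
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for s in symbols]


rng = random.Random(0)
sigma = noise_std_per_dimension(ebn0_db=12.0, bits_per_symbol=6)
received = add_awgn([1 + 0j] * 4, sigma, rng)
```

Because this layer has no trainable parameters and the noise is simply added, gradients pass through it unchanged during training.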
In wireless communication, frequency-selective fading is a radio propagation anomaly caused by partial cancellation of a radio signal by itself. The signal arrives at the receiver over several different paths, which causes intersymbol interference (ISI) in the received signal. For generality, we also run experiments under frequency-selective fading channels. Traditional methods add a guard interval to avoid or reduce ISI. Because our autoencoder is an end-to-end system, we simply increase the number of symbols instead of appending extra artificial symbols at the end of the transmitter. We train and test the models under two multipath channels, shown in Fig. 3. Channel A has two fading paths with a strong zero-delay path. Channel B, in contrast, has three fading paths, including a weak zero-delay one.
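The effect of a multipath channel on a symbol sequence can be sketched as a discrete convolution with the channel taps. The tap values below are hypothetical placeholders that only mimic the qualitative description (two paths with a strong zero-delay path, three paths with a weak zero-delay path). They are not the values of Fig. 3, and symbol-spaced path delays are an assumption.

```python
def multipath_channel(symbols, taps):
    """Convolve the symbols with the channel taps, truncated to the input length."""
    received = []
    for n in range(len(symbols)):
        acc = 0j
        for delay, tap in enumerate(taps):
            if n - delay >= 0:
                # Each delayed path adds a scaled copy of an earlier symbol (ISI).
                acc += tap * symbols[n - delay]
        received.append(acc)
    return received


# Hypothetical taps, index = delay in symbol periods (not taken from Fig. 3).
CHANNEL_A_LIKE = [0.8, 0.6]           # strong zero-delay path
CHANNEL_B_LIKE = [1 / 3, 2 / 3, 2 / 3]  # weak zero-delay path

impulse_response = multipath_channel([1, 0, 0, 0], CHANNEL_A_LIKE)
distorted = multipath_channel([1, 1j, -1], CHANNEL_B_LIKE)
```

With a weak zero-delay tap, most of the received energy for a symbol arrives mixed with its successors, which is consistent with channel B being the harder case.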
IV Experiments
To evaluate the proposed autoencoder, we train and test the model in different scenarios. In all scenarios, performance is measured by the bit error rate (BER), the number of bit errors divided by the number of transmitted bits. For generality, we first select the AWGN channel model. In practical wireless communication, however, the channel is more complex because signals arrive at the receiver over different paths, which leads to ISI between symbols.
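The BER used throughout the experiments can be computed as below. This is a minimal sketch that assumes the receiver output has already been converted to hard 0/1 decisions; the function name is hypothetical.

```python
def bit_error_rate(sent_bits, received_bits):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(sent_bits) != len(received_bits):
        raise ValueError("sequences must have the same length")
    if not sent_bits:
        raise ValueError("sequences must not be empty")
    errors = sum(1 for s, r in zip(sent_bits, received_bits) if s != r)
    return errors / len(sent_bits)


# Two of four bits differ in this hypothetical example.
ber = bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0])
```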
IV-A Settings
For simulation, we set the block size to 6 and the number of blocks to 400, so the autoencoder acts like a joint coding and modulation system for 64QAM. We compare the learned autoencoder with conventional coding and modulation methods. The data sets consist of randomly generated bits. The number of samples is 40000 for training and 10000 for testing. We set the batch size to 64 and use the Adam optimizer with a learning rate of 0.001. The autoencoder is trained under a channel with fixed SNR. Through several experiments, we find that the best training SNR is 12 dB.
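The quantities implied by these settings can be checked with a few lines of arithmetic. The sketch below also generates one input sample. It assumes independent, uniformly distributed bits, which is an assumption about the data generation, and the constant names are hypothetical.

```python
import random

BLOCK_SIZE = 6          # bits per block
NUM_BLOCKS = 400        # blocks per input sequence
TRAIN_SAMPLES = 40_000
TEST_SAMPLES = 10_000
BATCH_SIZE = 64

# Length of one input sequence in bits.
bits_per_sample = BLOCK_SIZE * NUM_BLOCKS
# Number of distinct 6-bit blocks, matching the 64QAM comparison.
constellation_points = 2 ** BLOCK_SIZE
# Number of full batches in one pass over the training set.
steps_per_epoch = TRAIN_SAMPLES // BATCH_SIZE

# One hypothetical training sample of i.i.d. uniform bits (assumed distribution).
rng = random.Random(0)
sample = [rng.randint(0, 1) for _ in range(bits_per_sample)]
```

With these values each sequence holds 2400 bits and one epoch consists of 625 batches.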
IV-B AWGN Channel
The performance of the autoencoder under the AWGN channel is shown in Fig. 4. We also implement the autoencoder in [9] for comparison. We add redundant information to resist the influence of the channel by increasing the number of symbols. We adjust the code rate by setting different dimensions for the time-distributed layer and the convolutional layer in the decoder. When the code rate is set to 1, which means the sequence is uncoded, our autoencoder performs very close to the conventional MMSE decoding method. As Fig. 4 shows, our block autoencoder clearly outperforms the autoencoder in [9]. When we decrease the code rate, which means we add redundant information to the encoded sequence, the performance of the autoencoder improves accordingly. For the coded case, we compare it with Viterbi hard decoding in 64QAM. Our autoencoder performs far better than Viterbi hard decoding at low SNR: it requires lower power to reach the same BER.
We plot the constellation diagram of the trained autoencoder in Fig. 5. The symbols plotted in the complex plane fall into 64 clusters. In actual deployment, it is straightforward to transmit the symbols through in-phase and quadrature components according to the constellation diagram.
IV-C Fading Channel
The performance under the two chosen channels is shown in Fig. 6. We set the code rate to 1/2 and the training SNR to 20 dB. Our autoencoder performs well for SNRs between 5 dB and 10 dB but hits an error floor when the SNR exceeds 15 dB. The BER under channel B is higher than under channel A because its zero-delay path is weaker.
To improve the performance of the autoencoder, we continue to decrease the code rate. As shown in Fig. 7, with the training SNR set to 12 dB, the BER under channel B decreases as we increase the number of symbols. However, this reduces the transmission efficiency and makes the system harder to deploy on hardware, so a tradeoff between BER and efficiency is important.
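The cost of lowering the code rate can be made concrete with a short calculation. The sketch below assumes that, at code rate 1, each block maps to one complex symbol (as in the 64QAM comparison) and that a code rate R multiplies the number of symbols by 1/R. This relation is an interpretation of the text and not a formula taken from it. The rate 1/3 is a hypothetical extra point.

```python
from fractions import Fraction


def symbols_per_sequence(num_blocks, code_rate):
    """Number of transmitted symbols, assuming num_blocks / code_rate symbols."""
    rate = Fraction(code_rate)
    if rate <= 0 or rate > 1:
        raise ValueError("code rate must be in (0, 1]")
    count = Fraction(num_blocks) / rate
    if count.denominator != 1:
        raise ValueError("code rate does not give a whole number of symbols")
    return int(count)


def info_bits_per_symbol(block_size, code_rate):
    """Information bits carried by each transmitted symbol."""
    return Fraction(block_size) * Fraction(code_rate)


# Block size 6 and 400 blocks, as in the simulation settings.
for rate in (Fraction(1), Fraction(1, 2), Fraction(1, 3)):
    n_symbols = symbols_per_sequence(400, rate)
    efficiency = info_bits_per_symbol(6, rate)
    print(f"rate {rate}: {n_symbols} symbols, {efficiency} info bits per symbol")
```

Under these assumptions, halving the code rate doubles the number of transmitted symbols and halves the information carried by each one, which is the efficiency loss that has to be weighed against the lower BER.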
V Conclusion
In this paper, we propose a new communication structure that combines LSTMs and CNNs. The autoencoder performs better than other autoencoder-based communication systems under AWGN and multipath fading channels. A regular constellation diagram can be learned under the average power constraint, which makes deployment on a hardware platform easier. In the wireless transmission scenario, the autoencoder needs extra symbols to resist channel fading, and the simulation results show that its BER can be reduced to an acceptable range by lowering the code rate. Owing to the properties of CNNs and LSTMs, the autoencoder places no limit on the length of the input sequence. Furthermore, we show that the training and testing processes do not need a particular channel model.
Future work based on the block autoencoder may proceed in the following directions.

Our autoencoder is a SISO system. The spectral efficiency of a SISO system is much lower than that of MIMO [2]. MIMO systems can increase throughput without additional bandwidth or transmit power. MIMO has become an essential element of wireless communication standards including IEEE 802.11n (WiFi), IEEE 802.11ac (WiFi), HSPA+ (3G), WiMAX (4G), and Long Term Evolution (4G LTE). Therefore, it is necessary to extend our system to a MIMO autoencoder.

As mentioned above, we can increase the number of symbols to reach a desired BER range. For the proposed autoencoder, however, the code rate must be low to achieve acceptable performance, which means more redundant information has to be added. It is therefore important to design a better structure, based on the block autoencoder, that is more robust to fading channels.
References

[1] H. Boche and E. A. Jorswieck, “Outage probability of multiple antenna systems: Optimal transmission and impact of correlation,” in International Zurich Seminar on Communications, 2004. IEEE, 2004, pp. 116–119.
[2] H. El Gamal, G. Caire, and M. O. Damen, “Lattice coding and decoding achieve the optimal diversity-multiplexing tradeoff of MIMO channels,” IEEE Transactions on Information Theory, vol. 50, no. 6, pp. 968–985, 2004.
[3] T. Wang, C.-K. Wen, H. Wang, F. Gao, T. Jiang, and S. Jin, “Deep learning for wireless physical layer: Opportunities and challenges,” China Communications, vol. 14, no. 11, pp. 92–111, 2017.
[4] H. Huang, J. Yang, H. Huang, Y. Song, and G. Gui, “Deep learning for super-resolution channel estimation and DOA estimation based massive MIMO system,” IEEE Transactions on Vehicular Technology, vol. 67, no. 9, pp. 8549–8560, 2018.
[5] A. Felix, S. Cammerer, S. Dörner, J. Hoydis, and S. Ten Brink, “OFDM-autoencoder for end-to-end learning of communications systems,” in 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC). IEEE, 2018, pp. 1–5.
[6] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563–575, 2017.
[7] V. Raj and S. Kalyani, “Backpropagating through the air: Deep learning at physical layer without channel models,” IEEE Communications Letters, vol. 22, no. 11, pp. 2278–2281, 2018.
[8] T. J. O’Shea, T. Roy, N. West, and B. C. Hilburn, “Physical layer communications system design over-the-air using adversarial networks,” in 2018 26th European Signal Processing Conference (EUSIPCO). IEEE, 2018, pp. 529–532.
[9] B. Zhu, J. Wang, L. He, and J. Song, “Joint transceiver optimization for wireless communication PHY using neural network,” IEEE Journal on Selected Areas in Communications, 2019.
[10] A. E. Jones, T. A. Wilkinson, and S. Barton, “Block coding scheme for reduction of peak to mean envelope power ratio of multicarrier transmission schemes,” Electronics Letters, vol. 30, no. 25, pp. 2098–2099, 1994.
[11] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[12] R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in Neural Networks for Perception. Elsevier, 1992, pp. 65–93.
[13] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.