I Introduction
Physical-layer network coding (PNC) is a promising approach for increasing the throughput of wireless relay networks with two-way or, more generally, multi-way communication flows. Several protocols for two-way wireless relaying have been proposed and analyzed in the literature [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. These approaches can be broadly classified by the number of steps it takes for user terminals to exchange their packets via a relay in the middle. For example, amplify-and-forward (AF) is a well-known approach for 2-step protocols, where two terminals send their packets to a relay in the first step, and the relay simply broadcasts an amplified version of the received signal in the second. In 3-step decode-and-forward schemes, on the other hand, the terminals send their packets in different steps to avoid interference, and the relay then broadcasts a signal after decoding the packet from each terminal. In this paper, we investigate two-way relaying with 2-step protocols, which can potentially achieve a higher throughput than other protocols owing to their time efficiency.
Optimized constellations for two-way relaying with a 2-step protocol based on denoise-and-forward (DNF) have been investigated in [11]. It is shown there that nonlinear network coding optimized in terms of the Euclidean distance property offers a significant improvement in system throughput compared to conventional network coding based on exclusive-or (XOR). However, a major limitation of that strategy is that the Euclidean distance of the constellation is not necessarily the most important performance measure in practical systems, where binary soft-decision channel decoding is employed. More specifically, for such systems, generalized mutual information (GMI) may be the appropriate measure to maximize. Furthermore, since the network coding in [11] is highly optimized for various factors, such as the channel coefficients and the combination of constellations transmitted from the two terminals, it requires a large lookup table to implement and may thus become infeasible as the constellation size increases.
Inspired by recent advances in deep learning (DL) techniques, the application of DL to communication systems has been widely studied [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. For example, deep neural network (DNN)-based signal detection [13, 14], joint source-channel coding [15], channel decoding [16, 17], and DNN-based autoencoders [18, 19, 20, 21] have been investigated. More recently, end-to-end learning of communication systems based on generative adversarial networks was introduced in [22].
In this paper, we investigate a new application of DL to optimizing constellations for two-way relaying capable of nonlinear PNC. We consider two-way wireless relaying with 2-step protocols, where all modulation and demodulation at the terminals and the relay is performed by DNNs. Unlike the conventional strategy for constellation optimization based on the Euclidean distance in the signal space, our training objective is to minimize the cross entropy, which directly maximizes the GMI, an important metric for systems employing soft-decision forward error correction. Our proposed DNN-based approach can be easily extended to higher-order constellations. Furthermore, the output of the DNN demodulator can be fed directly into the soft-decision channel decoder at the receiver, which is attractive in practice because no additional converter is required to generate log-likelihood ratios.
The contributions of this paper are summarized as follows:

We propose a new nonlinear network coding approach based on DNNs for two-way relay networks.

We design modulation and demodulation functions via DL such that the cross entropy loss is directly minimized in an end-to-end manner.

We extend our DNN design to high-order modulation.

We demonstrate through simulation results that our DL-based approach significantly improves the achievable sum rate over its conventional counterparts.
The rest of this paper is organized as follows. The general system model of two-way relay networks with 2-step protocols is introduced in Section II. Section III describes the proposed relay model based on DNN modulation/demodulation and its learning process. In Section IV, we evaluate the performance of the proposed scheme in terms of the achievable sum rate, where better performance than conventional relaying is observed. Finally, concluding remarks are given in Section V.
II Two-Way Relay Systems with 2-Step Protocols
Fig. 1 shows the system model for 2-step two-way wireless relaying, where terminal A has packets for terminal B, and vice versa. The relay node R performs PNC to assist the data exchange; for simplicity, it is neither a traffic source nor a sink.
II-A Multiple Access (MA) Step
Letting $\mathcal{M}$ be a signal mapping function, the transmitted signals at terminals A and B are given by $X_A = \mathcal{M}(S_A)$ and $X_B = \mathcal{M}(S_B)$, respectively, where $S_A$ and $S_B$ are the binary source data per symbol at each terminal. We assume that the mapping functions for terminals A and B are identical and that neither terminal can receive anything from the other terminal during the MA step due to the half-duplex constraint. The received signal at the relay node R is then expressed as $Y_R = H_A X_A + H_B X_B + Z_R$, where $H_A$ and $H_B$ are the complex channel coefficients from each terminal and $Z_R$ is additive white Gaussian noise (AWGN).
II-B Broadcast (BC) Step
The received signal at relay R, $Y_R$, is processed by the relay function $\mathcal{F}$ and then broadcast to terminals A and B. This function depends on the specific protocol; for example, in conventional AF relaying it is just a linear scaling function. We consider nonlinear PNC functions for $\mathcal{F}$, designed by DL. We denote the signal transmitted from relay R as $X_R = \mathcal{F}(Y_R)$. The received signals at terminals A and B are then given by $Y_A = H_A X_R + Z_A$ and $Y_B = H_B X_R + Z_B$, respectively, where $Z_A$ and $Z_B$ are AWGNs. For simplicity, we assume reciprocal channels for both steps. Finally, letting $\mathcal{D}$ be the demodulator function, terminals A and B detect their desired data as $\hat{S}_B = \mathcal{D}(Y_A, S_A)$ and $\hat{S}_A = \mathcal{D}(Y_B, S_B)$, respectively. Note that node A can exploit what it sent ($S_A$) to demodulate $Y_A$, and similarly for node B.
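As an illustration of the MA and BC steps, the following minimal Python sketch simulates one symbol exchange through a conventional AF relay. The 4QAM mapping `qpsk`, the helper names, and the choice of relay gain (unit average transmit power for unit-power symbols) are assumptions made for this sketch and are not taken from the paper; the channels are reciprocal, as assumed above.

```python
import random

def qpsk(bits):
    """Map two bits to a unit-power 4QAM point (hypothetical mapping)."""
    re = 1.0 if bits[0] == 0 else -1.0
    im = 1.0 if bits[1] == 0 else -1.0
    return complex(re, im) / 2 ** 0.5

def awgn(sigma2, rng):
    """Draw one circularly symmetric complex Gaussian sample of variance sigma2."""
    s = (sigma2 / 2) ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def two_step_af(s_a, s_b, h_a, h_b, sigma2, rng):
    """Simulate one MA step and one BC step with an AF relay."""
    # MA step: both terminals transmit at once; the relay sees the superposition.
    x_a, x_b = qpsk(s_a), qpsk(s_b)
    y_r = h_a * x_a + h_b * x_b + awgn(sigma2, rng)
    # AF relay function: linear scaling to unit average transmit power.
    gain = (abs(h_a) ** 2 + abs(h_b) ** 2 + sigma2) ** -0.5
    x_r = gain * y_r
    # BC step: reciprocal channels, independent noise at each terminal.
    y_a = h_a * x_r + awgn(sigma2, rng)
    y_b = h_b * x_r + awgn(sigma2, rng)
    return y_r, x_r, y_a, y_b

rng = random.Random(0)
y_r, x_r, y_a, y_b = two_step_af((0, 1), (1, 1), 0.8 + 0.3j, 0.5 - 0.6j, 0.1, rng)
```

In this linear setting, terminal A could subtract the contribution of its own symbol from `y_a` before detection, which is how its own data serves as side information; the DNN-based scheme replaces the linear scaling by a learned nonlinear function.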
II-C Main Challenges in Two-Way Relay Protocols
The major challenge in designing PNC for 2-step two-way relaying lies in how to deal with the interference between the signals from the two terminals in the MA step. In the conventional AF relaying scheme, the relay just amplifies the received signal and then broadcasts it to terminals A and B in the BC step. This AF relaying can cause noise enhancement. Furthermore, transmitting an amplified version of the sum of the two terminals' signals (plus noise) to each terminal may be inefficient, since terminals A and B already know their own information.
To cope with these issues, DNF was proposed in [23], where the relay first detects the two signals from terminals A and B by maximum-likelihood (ML) detection and then maps the result into a discrete signal constellation. To show the fundamental concept of DNF, let us assume a scenario where each terminal sends BPSK signals to the relay in the MA step, i.e., each transmitted symbol is $+1$ or $-1$. For simplicity of explanation, we assume unit channel gains on both links and a noiseless channel. In this case, the possible received signal at the relay is $-2$, $0$, or $+2$. If the relay receives $0$, there exists residual ambiguity and the relay cannot detect the individual symbols of A and B. However, we may use the following denoising map in the BC step: the relay transmits one BPSK symbol when it receives $0$ and the opposite BPSK symbol otherwise. When terminal A has sent a given symbol in the MA step, the relay therefore transmits one symbol if terminal B has sent the same symbol and the other if terminal B has sent the opposite symbol. This is how terminal A can successfully decide the information of terminal B, and vice versa, in DNF.
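The BPSK example can be checked exhaustively with a short Python sketch. The bit-to-symbol mapping in `bpsk` and the sign convention in `denoise` (received value 0 mapped to +1, received values of magnitude 2 mapped to -1) are assumptions chosen for illustration; the opposite convention works equally well.

```python
def bpsk(bit):
    """Assumed BPSK mapping: bit 0 -> +1, bit 1 -> -1."""
    return 1 - 2 * bit

def denoise(y_r):
    """Assumed denoising map: 0 -> +1 (bits differ), +2 or -2 -> -1 (bits equal)."""
    return 1 if y_r == 0 else -1

def recover_other_bit(own_bit, x_r):
    """Undo the XOR-like map using the terminal's own bit as side information."""
    differ = 1 if x_r == 1 else 0
    return own_bit ^ differ

table = {}
for s_a in (0, 1):
    for s_b in (0, 1):
        # Noiseless MA step with unit channel gains on both links.
        y_r = bpsk(s_a) + bpsk(s_b)
        x_r = denoise(y_r)
        # Each terminal decodes the other terminal's bit from the relay symbol.
        table[(s_a, s_b)] = (y_r, x_r,
                             recover_other_bit(s_a, x_r),
                             recover_other_bit(s_b, x_r))
```

Although the relay cannot tell the bit pairs (0, 1) and (1, 0) apart, every entry of `table` shows each terminal recovering the other terminal's bit, because the relay symbol carries the XOR of the two bits.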
A design strategy for the DNF denoising map that optimizes the Euclidean distance property has been investigated in [11]. It is shown in [11] that optimizing the signal constellations for PNC schemes in two-way relay networks offers a significant performance gain over conventional XOR-based network coding in terms of end-to-end throughput. However, a major challenge of this strategy is efficient implementation as the constellation size increases, since it requires a large number of denoising mapping patterns. Furthermore, the Euclidean distance is not necessarily of primary interest when channel codes are used. For this reason, we propose a DNN-based modulation/demodulation scheme that directly minimizes the bit-wise cross entropy or, equivalently, maximizes the bit likelihood. Furthermore, the proposed DNN-based scheme may be suited to efficient hardware implementation and is scalable to higher-order modulation.
III Deep Learning for PNC in Two-Way Relay
In this section, we introduce the DNN-based two-way relay system and describe the learning process for DNN-based modulator/demodulator/denoiser optimization. Fig. 2 depicts the proposed system, where we have three DNNs that are jointly trained.
III-A DNN Architecture
III-A1 Modulator at Terminals
Let $\mathcal{M}$ denote the DNN modulator function at the terminals, which maps $m$ binary bits into a constellation in the $N$-dimensional real space $\mathbb{R}^N$, i.e., $\mathcal{M}: \{0, 1\}^m \to \mathbb{R}^N$. We use a multilayer perceptron (MLP) with multiple hidden layers, each of which has a number of nonlinear activation nodes. At the last layer, the signal power of the modulator output is normalized with a batch normalization layer.
III-A2 PNC Modulator at Relay
Let $\mathcal{F}$ denote the DNN modulator function at the relay, which maps the real-valued $N$-dimensional received vector into a constellation in the $N$-dimensional real space $\mathbb{R}^N$, i.e., $\mathcal{F}: \mathbb{R}^N \to \mathbb{R}^N$. Note that an MLP-based DNN modulator can act as a nonlinear denoising map and is therefore capable of performing PNC. Analogously to the terminal modulator, we control the signal power with batch normalization at the output of the DNN modulator.
III-A3 Demodulator at Terminals
A DNN demodulator function at the terminals, denoted as $\mathcal{D}$, maps the received signal in the $N$-dimensional real space into an $m$-dimensional probability vector. Since each terminal knows its own information, this information is used as an additional input to the DNN by appending it to the received signal at terminals A and B, respectively. The output of the hidden layers is scaled into the range $(0, 1)$ by the last sigmoid layer and is interpreted as a likelihood.
Throughout this work, we use feed-forward DNNs with 2 hidden layers of 1000 nodes each; the last layers of the transmitter and receiver are linear and sigmoid, respectively. For simplicity, we assume $N = 2$ in this work to represent in-phase and quadrature signaling; an extension to higher-dimensional modulations is straightforward.
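To make the modulator architecture concrete, the following pure-Python sketch builds an untrained MLP with two ReLU hidden layers and a linear output, and then normalizes the average symbol energy over a batch. It is only a structural sketch under stated assumptions: the hidden width is reduced from 1000 to 32 to keep it fast, the weight initialization is arbitrary, and `normalize_power` is a simplified stand-in for the batch normalization layer used in the paper (it rescales but does not center the outputs).

```python
import random

def dense(x, w, b):
    """Affine layer: w is a list of rows, one row per output node."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, t) for t in v]

def init_layer(n_in, n_out, rng):
    """Random Gaussian weights and zero biases (arbitrary choice for the sketch)."""
    scale = (2.0 / n_in) ** 0.5
    w = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def modulator(bits, layers):
    """Two ReLU hidden layers followed by a linear output layer."""
    h = [float(b) for b in bits]
    for w, b in layers[:-1]:
        h = relu(dense(h, w, b))
    w, b = layers[-1]
    return dense(h, w, b)

def normalize_power(points):
    """Scale a batch so that the average energy per symbol is 1."""
    avg = sum(sum(c * c for c in p) for p in points) / len(points)
    return [[c / avg ** 0.5 for c in p] for p in points]

rng = random.Random(0)
m, n, hidden = 2, 2, 32  # 2 bits per symbol, 2 real dimensions (I and Q)
sizes = [m, hidden, hidden, n]
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]
batch = [(b0, b1) for b0 in (0, 1) for b1 in (0, 1)]
constellation = normalize_power([modulator(bits, layers) for bits in batch])
```

The relay modulator and the terminal demodulator would follow the same pattern with different input and output sizes, the demodulator ending in a sigmoid instead of a power normalization.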
III-B End-to-End Learning
The objective of the training is to minimize the cross entropy loss between the binary data transmitted from each terminal and the decoded information at each terminal after relaying. In order to do this, we jointly train three DNNs, i.e., the modulator at the terminal, the PNC modulator at the relay, and the demodulator at the terminal.
With stochastic gradient descent, we minimize the following binary cross entropy loss:
$$\mathcal{L} = \mathbb{E}\big[-s \log \hat{s} - (1 - s) \log(1 - \hat{s})\big] = \mathbb{E}\big[\log\big(1 + \exp((-1)^{s} \ell)\big)\big], \tag{1}$$
where the expectation is taken over all transmitted bits $s$ and demodulator outputs $\hat{s}$ across the training data. Here, we let $\ell$ denote the demodulator output prior to the sigmoid function, i.e., $\hat{s} = 1/(1 + \exp(-\ell))$. As shown in the following subsection, minimizing (1) is equivalent to maximizing the achievable rate in terms of GMI.
III-C Relationship Between Cross Entropy and GMI
The cross entropy given in (1) is minimized by increasing the log-probability corresponding to the information bit, which is equivalent to improving the bit log-likelihood ratio (LLR). The normalized GMI in data communications is given as
$$I = 1 - \mathbb{E}\big[\log_2\big(1 + \exp(-(-1)^{s} L)\big)\big], \tag{2}$$
where $L$ is the LLR value corresponding to the information bit $s$ (taken here as the log-ratio of the probabilities of $s = 0$ and $s = 1$). From (1) and (2), we can see that the GMI is a function of the cross entropy $\mathcal{L}$ in (1) over the empirical ensemble: $I = 1 - \mathcal{L}/\log 2$. Therefore, we minimize the bit-wise cross entropy loss in order to maximize the GMI, so that the achievable sum rates in two-way relaying systems can be maximized.
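The link between cross entropy and GMI can be verified numerically. The sketch below assumes the standard conventions that the sigmoid output is read as the probability of bit 1, that the LLR is the log-ratio of the probabilities of bit 0 and bit 1 (so the LLR is the negated pre-sigmoid output), and that the cross entropy is measured in nats; the synthetic logits are made up purely for the check.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def cross_entropy(bits, logits):
    """Mean binary cross entropy in nats, with sigmoid(logit) = P(bit = 1)."""
    total = 0.0
    for b, t in zip(bits, logits):
        p1 = sigmoid(t)
        total -= math.log(p1) if b == 1 else math.log(1.0 - p1)
    return total / len(bits)

def gmi(bits, llrs):
    """Normalized GMI in bits, with llr = log P(bit = 0) / P(bit = 1)."""
    total = 0.0
    for b, l in zip(bits, llrs):
        total += math.log2(1.0 + math.exp(-((-1) ** b) * l))
    return 1.0 - total / len(bits)

rng = random.Random(0)
bits = [rng.randint(0, 1) for _ in range(1000)]
# Synthetic noisy logits: positive on average for bit 1, negative for bit 0.
logits = [(2 * b - 1) * 2.0 + rng.gauss(0.0, 2.0) for b in bits]
llrs = [-t for t in logits]

ce_value = cross_entropy(bits, logits)
gmi_value = gmi(bits, llrs)
gap = abs(gmi_value - (1.0 - ce_value / math.log(2.0)))
```

Under these conventions `gap` is zero up to floating-point error, so lowering the training loss raises the GMI by the same amount divided by log 2.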
III-D Learning Fading from CSI
When the channel state information (CSI) is available at each terminal and the relay, we can incorporate the channel coefficients of the two links as an input to the modulators. For example, when both channel coefficients are available at the relay, we can expand the input dimension of the DNN modulator at the relay as
(3) 
Rather than such a simple concatenation of side information, it was found that further expansion of the input dimension with various different patterns of the data set leads to better performance. For example, the input dimensions of the DNN modulator at the relay may be extended by using the original input multiplied by channel coefficients as
(4) 
We note that increasing the input dimension may slow the learning convergence of the DNN, although it improves the performance. In this paper, in addition to (4), we use () as the input to the DNN modulator at the relay when perfect CSI is available.
Similarly, we may use channel coefficients as inputs to the DNN demodulator at terminals A and B. We assume that the CSI is used only for demodulation (not for modulation) at the terminals when it is available.
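The two ways of feeding CSI to the relay DNN can be sketched as feature-vector constructions. The exact feature sets used in the paper are not recoverable from the text, so both functions below are hypothetical readings: `relay_input_concat` appends the real and imaginary parts of the two channel coefficients to the received sample, and `relay_input_expanded` additionally appends the received sample multiplied by each coefficient.

```python
def ri(z):
    """Split a complex number into its real and imaginary parts."""
    return [z.real, z.imag]

def relay_input_concat(y_r, h_a, h_b):
    """Received sample with the channel coefficients appended (concatenation)."""
    return ri(y_r) + ri(h_a) + ri(h_b)

def relay_input_expanded(y_r, h_a, h_b):
    """Concatenation plus the received sample multiplied by each coefficient."""
    return relay_input_concat(y_r, h_a, h_b) + ri(y_r * h_a) + ri(y_r * h_b)

features = relay_input_expanded(1 + 2j, 0.5 + 0j, 0 + 1j)
```

Here the input dimension grows from 2 (no CSI) to 6 and then to 10, which illustrates the trade-off noted above between richer inputs and slower convergence.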
IV Simulation Results
In this section, we evaluate the performance of the proposed DNN-based PNC scheme. We compare the performance of the proposed scheme with that of conventional AF relaying in both the AWGN and frequency-flat block Rayleigh fading channels. We vary an average SNR for the performance evaluation, where is the variance of the Gaussian noise. The channel power ratio between and is assumed to be dB.
We use Chainer [24] for the DNN implementation. We set a minibatch size as and generate pseudo-random bits as training data. The epoch size is chosen as in all cases. In each epoch, the gradient of the loss function is calculated over the whole training data using Adam [25] with learning rate . The DNN modulators and demodulator are trained using randomly generated channel coefficients and Gaussian noise, where the same channel coefficients are used over a minibatch, while the Gaussian noise varies from symbol to symbol. Note that our system is trained with randomly varying channel coefficients so that the designed PNC works even for practical channels.
As mentioned earlier, the bit/block error rate may not be the appropriate performance measure, since practical wireless systems typically use soft-decision forward error correction such as low-density parity-check codes. For this reason, we focus on the achievable sum rate for the performance evaluation as
(5) 
where comes from the number of terminals and is the number of bits per symbol.
The training SNR is an important parameter when we train the proposed DNN-based system. In the following, we train the DNN at every dB, and then plot the best point among them for a given channel SNR. In practice, this can be regarded as the performance with adaptive selection of the mapping/demapping functions depending on the estimated channel SNR.
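The best-point selection described above amounts to taking, for each channel SNR, the maximum over the models trained at different SNRs. A minimal sketch follows; the function name `best_rate_envelope` is hypothetical and the numbers are made up for illustration only and are not results from the paper.

```python
def best_rate_envelope(rates):
    """rates[train_snr_db][channel_snr_db] holds an achievable sum rate.

    Returns, for each channel SNR, the pair (best training SNR, best rate).
    """
    envelope = {}
    for train_snr, curve in rates.items():
        for channel_snr, rate in curve.items():
            best = envelope.get(channel_snr)
            if best is None or rate > best[1]:
                envelope[channel_snr] = (train_snr, rate)
    return envelope

# Made-up example values (not taken from the paper).
rates = {
    0: {0: 0.9, 10: 1.5, 20: 1.6},
    10: {0: 0.7, 10: 1.8, 20: 2.4},
    20: {0: 0.4, 10: 1.6, 20: 3.1},
}
envelope = best_rate_envelope(rates)
```

In a deployed system, the same lookup would select which trained mapper/demapper to use once the channel SNR has been estimated.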
IV-A AWGN Channel
In this subsection, we evaluate the performance of the proposed scheme over the AWGN channel in terms of the achievable sum rate. We use rectified linear unit (ReLU) activations in the DNN architecture. As a benchmark, we also consider an ideal case where the information of both terminals is available at the relay DNN. Although this ideal DNN scenario may not be practical, it is useful as an upper bound on the room for performance improvement obtainable by enhancing the detection of the terminals' information at the relay.
IV-A1 4QAM
Fig. 3 compares the proposed DNN-based scheme with conventional AF relaying over the AWGN channel in terms of the achievable sum rate defined in (5). Here, we consider a constellation with two bits per symbol, i.e., 4QAM. From this figure, it is observed that the proposed scheme outperforms conventional AF relaying by more than dB. In particular, the proposed scheme offers a significant performance gain over AF relaying in the low SNR region, where an improvement of bps/Hz is observed in the achievable sum rate at an SNR of 0 dB. We can also see from this figure that the proposed scheme has a large performance gap from the ideal case, which is approximately dB. This indicates that the performance of the proposed scheme could be improved if the relay detected the terminals' information more accurately.
Fig. 4 shows the signal constellation that the relay broadcasts to terminals A and B in the BC step. In this figure, the Gaussian noise at the relay is not shown. It is observed that when the relay knows the terminals' information (ideal case), the relay broadcasts a 4QAM constellation to the terminals in all cases, which is equivalent to XOR denoising [11]. Meanwhile, when the relay has no such information and the channel SNR is low, it sends ary constellation. This indicates that the proposed DNN automatically controls the transmission rate depending on the channel condition. For the training SNR of – dB, the relay sends QAM, which is similar to the conventional AF scheme (since the result for the training SNR – dB is similar to that of dB, they were omitted).
IV-A2 16QAM
Fig. 5 shows the performance with 16QAM signaling over the AWGN channel. We observe from this figure that, in this case as well, the proposed scheme significantly outperforms the AF relaying scheme in the low SNR region, since it flexibly controls the transmission rate depending on the channel condition. On the other hand, in the high SNR region, the proposed DNN is outperformed by the conventional AF scheme. This is because, with AF, terminals A and B perform optimal ML detection, while in our proposed scheme the DNN demodulators try to approximate ML detection with a nonlinear function. Also, the proposed scheme still has a large performance gap from the ideal case.
In Fig. 6 we show the signal constellation that the relay sends to terminals A and B in the BC step. For the purpose of plotting the DNN modulator output, the Gaussian noise is again not shown. For the training SNR of dB, the DNN compresses the constellation to QAM in the ideal case, while the relay broadcasts QAM to terminals A and B when it has no information about the terminals' data. The DNN without this information still sends QAM when the training SNR is dB, whereas the ideal case transmits ary constellation. For relatively higher training SNRs, e.g., dB, the constellations become circular in both cases, while the ideal case compresses the constellation more effectively (similar results are observed for a training SNR of dB). For even higher training SNRs, the ideal case sends QAM, which is equivalent to XOR denoising [11]. On the other hand, the DNN without the information sends almost QAM constellation, which is equivalent to the AF scheme.
IV-B Frequency-Flat Rayleigh Fading Channel
In what follows, we show the simulation results for QAM signaling over frequency-flat block Rayleigh fading channels, where the channel coefficients are identical within a minibatch. We use activations in this subsection. We consider the following three scenarios:

No CSI is available: denoted by “DNN (no CSI).”

Perfect CSI is available at both terminals and the relay. Additionally, the terminals' information is available at the relay: denoted by “DNN (Ideal).”

Perfect CSI is available at both terminals and the relay: denoted by “DNN.”
The achievable sum rate for frequency-flat Rayleigh fading with 4QAM is shown in Fig. 7. In this figure, we can see a significant performance gain greater than dB by the proposed scheme over the AF scheme in the low SNR region. We also observe that even in the high SNR region, e.g., dB, the proposed scheme still has a gap from the ideal case, which indicates that there is room to improve the performance of the proposed scheme by improving the detection at the relay. It is also observed that the achievable sum rate of the proposed scheme in the no-CSI case does not exceed about half of the best achievable sum rate for this setting. This is because the amplitude and phase information in the signal constellation can be useless when the CSI is unknown, and an on-off keying type constellation might have been learned.
V Conclusion
In this paper, we proposed a new application of deep learning to PNC in wireless relay networks, where the signal constellation mapping and demapping are performed by DNNs. The DNNs are trained such that the GMI is directly maximized, which is an appropriate performance measure for practical wireless systems with soft-decision channel decoders. Simulation results demonstrated that the proposed DNN-based PNC scheme offers a significant performance gain over conventional AF relaying systems and achieves higher sum rates.
Since it was observed that there may be room for significant performance improvement by enhancing the detection at the relay, multi-task learning [26] that takes both the end-to-end cross entropy and the detection error at the relay into account may be helpful. Also, since this work is limited to the design of the constellation mapper/demapper, our future work may include the joint design of channel coding for the DNN-based relaying network.
References
 [1] J. N. Laneman, D. N. Tse, and G. W. Wornell, “Cooperative diversity in wireless networks: Efficient protocols and outage behavior,” IEEE Trans. Inf. Theory, vol. 50, no. 12, pp. 3062–3080, 2004.
 [2] P. Popovski and H. Yomo, “The anti-packets can increase the achievable throughput of a wireless multi-hop network,” in Proc. 2006 IEEE International Conference on Communications (ICC), vol. 9, 2006, pp. 3885–3890.
 [3] P. Popovski and H. Yomo, “Physical network coding in two-way wireless relay channels,” in Proc. 2007 IEEE International Conference on Communications (ICC), 2007, pp. 707–712.
 [4] S. J. Kim, P. Mitran, and V. Tarokh, “Performance bounds for bidirectional coded cooperation protocols,” IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5235–5241, 2008.
 [5] B. Rankov and A. Wittneben, “Spectral efficient protocols for half-duplex fading relay channels,” IEEE J. Sel. Areas Commun., vol. 25, no. 2, 2007.
 [6] R. H. Louie, Y. Li, and B. Vucetic, “Practical physical layer network coding for two-way relay channels: performance analysis and comparison,” IEEE Trans. Wireless Commun., vol. 9, no. 2, 2010.
 [7] M. P. Wilson, K. Narayanan, H. D. Pfister, and A. Sprintson, “Joint physical layer coding and network coding for bidirectional relaying,” IEEE Trans. Inf. Theory, vol. 56, no. 11, pp. 5641–5654, 2010.
 [8] S. Zhang and S.-C. Liew, “Channel coding and decoding in a relay system operated with physical-layer network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 788–796, 2009.
 [9] T. Koike-Akino, P. Popovski, and V. Tarokh, “Denoising strategy for convolutionally-coded bidirectional relaying,” in Proc. 2009 IEEE International Conference on Communications (ICC), 2009, pp. 1–5.
 [10] S. C. Liew, S. Zhang, and L. Lu, “Physical-layer network coding: Tutorial, survey, and beyond,” Physical Communication, vol. 6, pp. 4–42, 2013.
 [11] T. Koike-Akino, P. Popovski, and V. Tarokh, “Optimized constellations for two-way wireless relaying with physical network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, 2009.
 [12] H. Kim, Y. Jiang, R. Rana, S. Kannan, S. Oh, and P. Viswanath, “Communication algorithms via deep learning,” arXiv preprint arXiv:1805.09317, 2018.
 [13] N. Farsad and A. Goldsmith, “Neural network detection of data sequences in communication systems,” arXiv preprint arXiv:1802.02046, 2018.
 [14] ——, “Sliding bidirectional recurrent neural networks for sequence detection in communication systems,” arXiv preprint arXiv:1802.08154, 2018.
 [15] N. Farsad, M. Rao, and A. Goldsmith, “Deep learning for joint sourcechannel coding of text,” arXiv preprint arXiv:1802.06832, 2018.
 [16] T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, “On deep learning-based channel decoding,” in Proc. 2017 51st Annual Conference on Information Sciences and Systems (CISS), 2017, pp. 1–6.
 [17] F. Liang, C. Shen, and F. Wu, “An iterative BP-CNN architecture for channel decoding,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 144–159, 2018.
 [18] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. on Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, 2017.
 [19] S. Dörner, S. Cammerer, J. Hoydis, and S. ten Brink, “Deep learning based communication over the air,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 132–143, 2018.
 [20] B. Karanov, M. Chagnon, F. Thouin, T. A. Eriksson, H. Bülow, D. Lavery, P. Bayvel, and L. Schmalen, “End-to-end deep learning of optical fiber communications,” arXiv preprint arXiv:1804.04097, 2018.
 [21] F. A. Aoudia and J. Hoydis, “End-to-end learning of communications systems without a channel model,” arXiv preprint arXiv:1804.02276, 2018.
 [22] H. Ye, G. Y. Li, B.-H. F. Juang, and K. Sivanesan, “Channel agnostic end-to-end learning based communication systems with conditional GAN,” arXiv preprint arXiv:1807.00447, 2018.
 [23] P. Popovski and H. Yomo, “Bi-directional amplification of throughput in a wireless multi-hop network,” in Proc. 2006 IEEE Vehicular Technology Conference (VTC-Spring), vol. 2, 2006, pp. 588–593.
 [24] S. Tokui, K. Oono, S. Hido, and J. Clayton, “Chainer: a next-generation open source framework for deep learning,” in Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Twenty-ninth Annual Conference on Neural Information Processing Systems (NIPS), 2015.
 [25] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
 [26] R. Caruana, “Multitask learning,” Machine learning, vol. 28, no. 1, pp. 41–75, 1997.