I Introduction
Radio-Frequency (RF) waves can be used to transmit information and power simultaneously, yet RF transmission of these two quantities has traditionally been treated separately. The community is currently experiencing a paradigm shift in wireless network design, namely unifying the transmission of information and power so as to make the best use of the RF spectrum and radiation, as well as of the network infrastructure, for the dual purpose of communicating and energizing [1]. This has led to growing attention to the emerging area of Simultaneous Wireless Information and Power Transfer (SWIPT). As one of the primary works in the information theory literature, Varshney studied SWIPT in [2], where he characterized the capacity-power function for a point-to-point discrete memoryless channel. Recent results in the literature have also revealed that in many scenarios there is a tradeoff between information rate and delivered power, for example in frequency-selective channels [3], MIMO broadcasting [4] and interference channels [5].
The main challenge in Wireless Power Transfer (WPT) is to increase the Direct-Current (DC) power at the output of the harvester without increasing the transmit power. The harvester, known as a rectenna, is composed of an antenna followed by a rectifier. (Footnote 1: In the literature, the rectifier is usually considered as a nonlinear device (usually a diode) followed by a low-pass filter. The diode is the main source of nonlinearity induced in the system.) In [6, 7], it is shown that the RF-to-DC conversion efficiency is a function of the rectenna’s structure, as well as of its input waveform (power and shape). Accordingly, in order to maximize the rectenna’s output power, a systematic waveform design is crucial to make the best use of the available RF spectrum [7]. In [7], an analytical model for the rectenna’s output is introduced via the Taylor expansion of the diode characteristic function, and a systematic design for multisine waveforms is derived. The nonlinear model and the waveform design were validated using circuit simulations in [7, 8] and recently confirmed through prototyping and experimentation in [9]. Those works also confirm the inaccuracy of the model in which the rectifier’s output power depends linearly on its input power. (Footnote 2: The linear model implies that the RF-to-DC conversion efficiency of the energy harvester (EH) is constant and independent of the harvester’s input waveform (power and shape) [4, 10].) As one of the main conclusions, it is shown that the rectifier’s nonlinearity is beneficial to the system performance and has a significant impact on the design of signals and systems involving wireless power.

The SWIPT literature has so far, to a great extent, ignored the nonlinearity of the EH and has focused on the linear model of the rectifier, e.g., [3, 4, 5]. However, it is recognized that considering the harvester nonlinearity changes the design of SWIPT at the physical layer and the medium access control layer [1]. Nonlinearity leads to various energy harvester models [7, 11, 12], and to new designs of modulation and input distribution [13, 14, 15], waveform [16], RF spectrum use [16], transmitter and receiver architecture [16, 14, 17] and resource allocation [11, 18, 19]. Of particular interest is the role played by nonlinearity in SWIPT signalling for single-carrier and multi-carrier transmissions [16, 13, 14, 1, 20]. In multi-carrier transmissions, it is shown in [16] that inputs modulated according to Circularly Symmetric Complex Gaussian (CSCG) distributions improve the delivered power compared to unmodulated continuous waves. Furthermore, in [13], it is shown that for an AWGN channel with complex Gaussian inputs under average power and delivered power constraints, the power allocation between the real and imaginary components is asymmetric, depending on the receiver’s demand for information and power. As an extreme point, when the receiver demands only power, the entire transmit power budget is allocated to either the real or the imaginary component. In [14, 20], it is shown that the capacity-achieving input distribution of an AWGN channel under average, peak and delivered power constraints is discrete in amplitude, with a finite number of mass points, and has a uniformly distributed independent phase. In multi-carrier transmission, however, it is shown in [16] that nonzero mean Gaussian input distributions lead to an enlarged Rate-Power (RP) region compared to CSCG input distributions. This highlights that the choice of a suitable input distribution (and therefore modulation and waveform) for SWIPT is affected by the EH nonlinearity, and motivates the study of the capacity of AWGN channels under nonlinear power constraints.

Our interest in this paper lies in the apparent difference in input distribution between single-carrier and multi-carrier transmission: single-carrier transmission favors asymmetric inputs [13], while multi-carrier transmission favors nonzero mean inputs [16]. We aim at tackling the design of the input distribution for SWIPT under nonlinear constraints using a unified framework based on nonzero mean and asymmetric distributions. To that end, we study SWIPT in a multi-carrier setting subject to the nonlinearities of the EH. We consider a frequency-selective channel subject to a transmit average power constraint and a receiver delivered power constraint. We mainly focus on complex Gaussian inputs, where the inputs of the real subchannels are independent of each other and, on each real subchannel, the inputs are independent and identically distributed (iid).
We aim at reconciling the two main observations of the previous paragraph, namely that asymmetric Gaussian inputs and nonzero mean Gaussian inputs outperform CSCG inputs in single-carrier transmission [13] and multi-carrier transmission [16], respectively. The contributions of this paper are listed below.

First, taking advantage of the small-signal approximation for the rectenna’s nonlinear output introduced in [16], we obtain the general form of the delivered power in terms of system baseband parameters. It is shown that, first, unlike in the linear model, the delivered power at the receiver depends on several moments of the channel input, such as the first, second and fourth moments. Second, the amount of delivered power on each subchannel depends on its adjacent subchannels.

Assuming nonzero mean Gaussian inputs, an optimization algorithm is introduced. Numerical optimization reveals that, in scenarios where the receiver is interested in both information and power simultaneously, the inputs have nonzero mean and nonzero variance. Two important observations are made: first, allowing the input to have nonzero mean significantly improves the rate-power region; second, for receiver demands that concern both information and power, the power allocation between the real and imaginary components of each complex subchannel is asymmetric in general. These results can be thought of as a generalization of the results in [13] and [16], where asymmetric power allocation (in flat fading channels) and nonzero mean inputs (in frequency-selective channels) are proposed, respectively, in order to achieve larger RP regions.

As a special scenario, we consider optimized zero mean Gaussian inputs under the assumption of a nonlinear EH. For this case, optimality conditions are derived. It is shown that (similarly to nonzero mean inputs) under the nonlinear assumption for the EH, the power allocation on each subchannel depends on the other subchannels as well. Enforcing the optimality conditions numerically, it is observed that a larger RP region is obtained than with the optimal zero mean inputs under the linear assumption for the EH.
Organization: In Section II, we introduce the system model. In Section III, the studied problem is introduced. In Section IV, the delivered power at the output of the EH is obtained in terms of system baseband parameters. In Section V, the rate-power maximization over frequency-selective channels with nonzero mean Gaussian inputs is considered. As a special case, the optimality conditions for power allocation on different subchannels are obtained for zero mean Gaussian inputs. In Section VI, WPT and SWIPT optimization for the studied problem is introduced and numerical results are presented. We conclude the paper in Section VII, and the proofs of some of the results are provided in the Appendices at the end of the paper.
Notation: Throughout this paper, random variables and their realizations are represented by capital and small letters, respectively.
and denote the expectation over statistical randomness and the average over time of the process , respectively, i.e.,(1)  
(2) 
where
denotes the Cumulative Distribution Function (CDF) of the process
. denotes circular convolution. The standard CSCG distribution is denoted by . Complex conjugate of a complex number is denoted by . and are real and imaginary operators, respectively. For a complex random variable , we denote , , , and . The moments corresponding to real and imaginary components of are represented by subscripts and , respectively, i.e., , , and and similarly for imaginary counterparts. denotes remainder of the argument with respect to . for and zero elsewhere. and . denotes the partial derivative of the function with respect to , i.e.,. The vector
is represented by . Throughout the paper, complex subchannels and their real/imaginary components are referred to as c-subchannels and r-subchannels, respectively.

II System Model
Considering a point-to-point tap frequency-selective AWGN channel, in the following, we explain the operation of the transmitter and the receiver.
II-A Transmitter
The transmitter utilizes Orthogonal Frequency Division Multiplexing (OFDM) to transmit information and power over the channel. Let denote the modulated Information-Power (IP) complex symbols over subcarriers (c-subchannels), occupying the overall bandwidth of Hz and being uniformly separated by Hz. The Inverse Discrete Fourier Transform (IDFT) is applied over the IP symbols and a Cyclic Prefix (CP) is added (Footnote 3: In this paper, we consider and for the DFT and IDFT definitions, respectively.), producing the time-domain signal given by
(3) 
Next, the signal
(4) 
is upconverted to the carrier frequency and is transmitted over the channel.
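The transmitter chain described above (IDFT followed by cyclic prefix insertion) can be sketched as follows. This is only an illustrative sketch: the 1/N scaling of the IDFT, the function names and the example symbols are assumptions made here, not the definitions used in this paper, and up-conversion to the carrier frequency is not modelled.

```python
import cmath


def idft(symbols):
    """Inverse DFT with a 1/N scaling (an assumed normalization)."""
    n = len(symbols)
    return [
        sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def add_cyclic_prefix(block, cp_len):
    """Prepend the last cp_len time-domain samples to the block."""
    if not 0 <= cp_len <= len(block):
        raise ValueError("cp_len must lie between 0 and the block length")
    return block[len(block) - cp_len:] + block


def ofdm_modulate(symbols, cp_len):
    """Map frequency-domain symbols to one baseband OFDM block with CP."""
    return add_cyclic_prefix(idft(symbols), cp_len)


# Hypothetical IP symbols on four c-subchannels.
symbols = [1 + 1j, -1 + 0.5j, 0.5 - 1j, -0.5 - 0.5j]
tx = ofdm_modulate(symbols, cp_len=2)
```

The first `cp_len` samples of `tx` are copies of its last `cp_len` samples, which is what later allows the receiver to treat the channel as a circular convolution.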
II-B Receiver
The filtered received RF waveform at the receiver is modelled as
(5) 
where is the baseband equivalent of the channel output with bandwidth Hz. In order to guarantee narrowband transmission, we assume that .
Delivered Power: The power of the signal (denoted by ) is harvested using a rectenna. The delivered power is modelled as
(6) 
where and are constants. (Footnote 4: The reader is referred to [7] for detailed explanations of the model. Also note that, according to [16], the rectenna’s output is a current, in amperes. However, since power is proportional to current, with abuse of notation, we refer to the term in (6) as power.)
Information Receiver: The signal is downconverted and sampled with sampling frequency producing given by
(7) 
where represents a sample of the additive noise at time . is the csubchannel tap and is a sample of the signal given in (4) at time .
Considering one OFDM block, the receiver discards the CP and converts the symbols back to the frequency domain by applying DFT on (7), such that
(8) 
where is the DFT of . and are DFTs of the extended channel vector , symbols (equivalently, samples of at times ) and noise samples , respectively. That is,
(9a)  
(9b) 
and similarly for . We assume as iid and CSCG random variables with variance , i.e., for . The channel frequency response is assumed to be known at the transmitter.
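The effect of discarding the CP and applying the DFT can be checked numerically. The sketch below is noise-free and uses hypothetical taps and symbols; the unscaled forward DFT paired with a 1/N inverse is an assumed convention. It illustrates that, with a CP at least as long as the channel memory, each c-subchannel output equals the input symbol multiplied by the channel frequency response.

```python
import cmath


def dft(x):
    """Forward DFT without scaling (assumed convention)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT with a 1/N scaling (assumed convention)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def fir_channel(signal, taps):
    """Noise-free linear convolution, truncated to the input length."""
    return [
        sum(taps[l] * signal[t - l] for l in range(len(taps)) if t - l >= 0)
        for t in range(len(signal))
    ]


# Hypothetical symbols and channel taps, for illustration only.
symbols = [1 + 1j, -1 + 0.5j, 0.5 - 1j, -0.5 - 0.5j, 1j, -1, 0.25, 2 - 1j]
taps = [1.0, 0.5 - 0.2j, 0.1j]
n = len(symbols)
cp_len = len(taps) - 1                      # CP covers the channel memory

block = idft(symbols)
tx = block[n - cp_len:] + block             # add the cyclic prefix
rx = fir_channel(tx, taps)
received = dft(rx[cp_len:])                 # discard the CP, back to frequency
response = dft(taps + [0.0] * (n - len(taps)))  # DFT of the extended channel vector

errors = [abs(received[k] - response[k] * symbols[k]) for k in range(n)]
```

Up to rounding, every entry of `errors` is zero, so the frequency-selective channel decouples into parallel c-subchannels for the information receiver.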
III Problem Statement
We aim at maximizing the rate of transmitted information, as well as the amount of delivered power at the receiver, given that the input in each c-subchannel is distributed according to a nonzero mean complex Gaussian distribution. We also assume that in each c-subchannel the real and imaginary components are independent. Accordingly, the optimization problem consists in maximizing the mutual information between the channel input and the channel output (see (8)) under an average power constraint at the transmitter and a delivered power constraint at the receiver, such that, and for . Hence, we have
(10)
s.t. 
where and are the average power and mean of the c-subchannel, respectively. is the available power budget at the transmitter. is the minimum amount of average delivered power at the receiver. Maximization is taken over all the means and powers () of independent complex Gaussian inputs , such that the constraints are satisfied.
IV Power metric in terms of channel baseband parameters
In this section, we study the delivered power at the receiver based on the model in (6). Note that most communication processes, such as coding/decoding and modulation/demodulation, are performed at baseband. Therefore, from a communication system design point of view, it is preferable to have a baseband equivalent representation of the system. Hence, in the following proposition, we express the delivered power in (6) in terms of the channel and its input baseband parameters. For brevity of representation, we neglect the delivered power from the CP, and we also assume that is odd (the calculations can easily be extended to even values of , following similar steps).

Proposition 1.
Given that the inputs on each r-subchannel are iid and that the inputs on different r-subchannels are independent, the delivered power at the receiver can be expressed in terms of the channel baseband parameters and statistics of the channel input distribution as
(11) 
where is odd and , , , and are defined as
(12a)  
(12b)  
(12c)  
(12d)  
(12e)  
(12f)  
(12g) 
with being the samples of the channel at times between two consecutive information samples (for more details see Appendix A-B).
Proof: See Appendix A.
Remark 1.
We note that, as also mentioned in Proposition 1, the expression for the delivered power is based on the assumption that the inputs on different r-subchannels are independent and that the inputs on each r-subchannel are iid. Obtaining a closed-form expression for the delivered power at the receiver when the inputs on different r-subchannels are not iid is cumbersome. This is due to the fact that the fourth moment of the received signal creates dependencies among the inputs of different r-subchannels. As another point, we note that in the calculations for the delivered power in Proposition 1, we neglect the delivered power from the CP. This, along with the aforementioned assumptions on the input distributions, implies that the actual delivered power (based on the model introduced in (6)) is larger than (11). Indeed, the subscript in (11) stands for inner bound in order to express this point.
Remark 2.
Note that similar results are reported in [13] for the single-carrier AWGN channel, where the delivered power depends on higher moments of the channel input. In [16], a superposition of deterministic and CSCG signals is assumed for multi-carrier transmissions, with the assumption that the receiver utilizes a power splitter. Part of the signal is used for power transfer and the other part is used for information transmission. (Footnote 5: We note that the model considered for signal transmission in this paper is different from the multi-subband orthogonal transmission considered in [16].) In comparison to the results in [16], we note that, here, the channel input is generalized in the sense that it allows asymmetric power allocation across all r-subchannels. Also, no power splitter is assumed at the receiver. (Footnote 6: The scenario considered in this paper can be regarded as an optimistic upper bound on the system performance, since (so far) in practice, it is not possible to jointly decode information and harvest power from the same signal.)
V Rate-Power Maximization Over Gaussian Inputs
In this section, we consider the SWIPT optimization problem in (10). We obtain the optimality conditions in their general form (assuming nonzero mean inputs), to be used in Section VI in order to obtain (locally) optimal power allocations for the different r-subchannels. In order to better understand the problem, the optimality conditions are then specialized analytically to zero mean Gaussian inputs.
V-A SWIPT with nonzero mean complex Gaussian inputs
Assuming that the inputs of the c-subchannels have, in general, nonzero mean, the problem in (10) can be rewritten as follows
(13)  
s.t. 
where , , , . Note that for a Gaussian distribution in the function , we have , .
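For reference, the raw moments of a real Gaussian input on one r-subchannel can be written directly in terms of its mean and average power. The sketch below states these standard Gaussian identities; the function name and the parameter names `mean` and `power` are hypothetical and are not the symbols used in (13).

```python
def gaussian_raw_moments(mean, power):
    """First four raw moments of a real Gaussian with the given mean and
    average power E[X^2] = power, so that the variance is power - mean^2."""
    variance = power - mean ** 2
    if variance < 0:
        raise ValueError("power must be at least mean squared")
    first = mean
    second = power
    third = mean ** 3 + 3 * mean * variance
    fourth = mean ** 4 + 6 * mean ** 2 * variance + 3 * variance ** 2
    return first, second, third, fourth


# Example: one r-subchannel with mean 1.5 and average power 4.
m1, m2, m3, m4 = gaussian_raw_moments(1.5, 4.0)
```

Substituting the variance shows that the third and fourth moments reduce to `mean * (3 * power - 2 * mean**2)` and `3 * power**2 - 2 * mean**4`, so for a Gaussian input all the moments that enter the delivered power are functions of the mean and the power alone.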
In Section VI, we consider the numerical optimization of problem (13) by considering its Lagrangian. (Footnote 7: The problem in (13) is not convex, and any solution obtained from solving the dual problem is in general a local optimum.) The KKT conditions for problem (13) are detailed in Appendix B. As can be seen from these conditions, it is unfortunately cumbersome to derive analytical results on the optimal solution of problem (13). However, it can be shown that for the optimal solution, the average power constraint is satisfied with equality (see Appendix B for the details).
As explained in Section VI, numerical results reveal that nonzero mean asymmetric complex Gaussian inputs result in a larger RP region compared to their zero mean counterparts. However, in order to better understand the problem in its general form (assuming nonzero mean), it is beneficial to look into the optimality conditions for zero mean inputs.
V-B SWIPT with zero mean complex Gaussian inputs
In the following, we obtain the optimality conditions for power allocation among different r-subchannels when the input distributions are complex Gaussian with zero mean and independent components.
Lemma 1.
If , the optimal power allocation for problem (13) satisfies the average power and delivered power constraints with equality, i.e.,
(14a)  
(14b) 
with . Also for the optimal vectors we have
(15a)  
(15b) 
with
(16) 
for some
(17a)  
(17b) 
and . For , the optimal power allocations reduce to the water-filling solution, i.e.,
(18) 
Proof: See Appendix C.
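The water-filling allocation to which the lemma reduces when the delivered power constraint is inactive can be computed by a bisection on the water level. The following is a generic textbook sketch, not the procedure used in this paper; the function name, the gain values and the unit noise variance are assumptions for illustration.

```python
def water_filling(gains, total_power, noise_var=1.0, tol=1e-12):
    """Allocate total_power across subchannels with power gains `gains`
    (squared channel magnitudes, all assumed positive) by bisecting on the
    water level so that the allocated powers sum to total_power."""
    floors = [noise_var / g for g in gains]
    lo, hi = min(floors), max(floors) + total_power
    while hi - lo > tol:
        level = 0.5 * (lo + hi)
        used = sum(max(0.0, level - f) for f in floors)
        if used > total_power:
            hi = level
        else:
            lo = level
    return [max(0.0, lo - f) for f in floors]


# Hypothetical gains for three c-subchannels and a unit power budget.
powers = water_filling([2.0, 1.0, 0.1], total_power=1.0)
```

In this example the weakest subchannel receives no power, while the remaining two share the budget according to their noise-to-gain ratios.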
Remark 3.
Note that the delivered power in the c-subchannel for zero mean Gaussian inputs, i.e., is dependent on other c-subchannels through . (Footnote 8: Note that for zero mean inputs with nonlinear EH, yields the same delivered power and transmitted information as .) This is in contrast with the linear model, where the delivered power is obtained as .
Remark 4.
The optimality conditions of Lemma 1 in (15) can be interpreted as follows. The functions are positive and convex (the Hessian matrix is positive definite). Also note that is a mirrored version of with respect to the surface . Assume that is given and that is chosen as a large value (so that it satisfies (17a)). Consider the intersection of the horizontal surface with functions and for some index . Depending on the value of and shape of the functions , different pairs of satisfy simultaneously
(19) 
The number of these solution pairs for each index can be verified to vary from three to four. That is, if , there are three solutions, and if , there are four solutions for (19). (Footnote 9: Note that must satisfy the condition in (17a) as well.) In Figure 1, an illustration of the intersection of the aforementioned three surfaces for a specific index is provided, where four pairs of solutions are recognized. In Figure 2, the same illustration is presented as viewed from the top along the axis. Points and denote the solution pairs that satisfy (19). Note that, depending on the average power constraint, some (or all) of the points and are not admissible (for example, here is not admissible). If no point satisfies the average power constraint, the power allocated to the corresponding c-subchannel is zero (in order to satisfy (15)). Otherwise, there is more than one set of power allocations that satisfy the necessary optimality conditions. Accordingly, the power allocation could be either symmetric (corresponding to either of the points ) or asymmetric (corresponding to either of the points ). Note that both points and contribute the same amount to the delivered power and the transmitted information (as noted in Remark 3). Therefore, they can be chosen interchangeably.
Remark 5.
The optimality conditions in (15) can be solved numerically using a solver for nonlinear equations with constraints ( for ). In Section VI, it is observed through numerical optimization that for pure WPT purposes (equivalently, large values of ) all the available power at the transmitter is allocated to either the real or the imaginary component of the strongest c-subchannel. Additionally, note that for zero mean Gaussian inputs, even when optimized for WPT, the amount of transmitted information is never zero.
VI Numerical Optimization
In this section, we provide numerical results regarding the power allocation for the different r-subchannels under a fixed average power constraint and different delivered power constraints, in order to obtain the RP regions corresponding to the different types of complex Gaussian inputs introduced earlier.
VI-A Nonzero mean inputs
We note that the optimization problem in (10) is not convex and, accordingly, the final solution (obtained via numerical optimization) is in general a local stationary point. Due to the nonconvexity of the studied problem, the final solution depends on the initial starting point. In order to alleviate the effect of the initial point, we first focus on the WPT aspect of the optimization problem with deterministic input signals, i.e., the variances of the different r-subchannels are close to zero to a good approximation. (Footnote 10: We note that although we first optimize over deterministic signals for WPT, optimizing over means and powers for SWIPT results in the same solutions, i.e., signals with almost zero variance, however at the expense of a long simulation time. Therefore, for the starting point of the RP region, we choose the input to be deterministic.) In this case, with deterministic input signals we have , for . Therefore, the delivered power reads as
(20) 
Accordingly, we consider the following WPT problem
(21)  
s.t. 
where the proof that the average power constraint is satisfied with equality is provided in Appendix B. The algorithm (WPT optimization with deterministic inputs) is run a large number of times (here, 1000 times) using the Matlab command fmincon(), each time with a new, randomly generated initial complex mean vector . After this stage, the solution corresponding to the highest delivered power is chosen as the initial starting point for the SWIPT optimization.
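The multi-start strategy described above can be illustrated on a toy problem. The sketch below is not the optimization in (21): it replaces fmincon() with a simple random hill climb, and it uses a stand-in objective, namely the time-averaged fourth power of a unit-amplitude multisine as a function of its phases. The tone index offset `f0`, the step sizes and the number of runs are all assumptions chosen for illustration.

```python
import math
import random


def avg_fourth_power(phases, f0=10, samples=128):
    """Time average of y(t)^4 over one period for a unit-amplitude multisine
    with tone indices f0, f0 + 1, ... (toy stand-in objective)."""
    total = 0.0
    for m in range(samples):
        t = m / samples
        y = sum(math.cos(2 * math.pi * (f0 + n) * t + ph)
                for n, ph in enumerate(phases))
        total += y ** 4
    return total / samples


def hill_climb(objective, start, rng, step=0.5, iters=100):
    """Local search: keep a random perturbation only if it improves."""
    best, best_val = list(start), objective(start)
    for _ in range(iters):
        cand = [p + rng.gauss(0.0, step) for p in best]
        val = objective(cand)
        if val > best_val:
            best, best_val = cand, val
        else:
            step *= 0.97                    # shrink the step after a failure
    return best, best_val


def multi_start(objective, dim, runs, seed=0):
    """Run the local search from random starts and keep the best result."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        start = [rng.uniform(0.0, 2 * math.pi) for _ in range(dim)]
        results.append(hill_climb(objective, start, rng))
    return max(results, key=lambda r: r[1]), results


(best_phases, best_val), all_runs = multi_start(avg_fourth_power, dim=4, runs=10)
```

Different starting points may end at different local solutions, which is why only the best of the runs is retained as the starting point for the next stage.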
Next, in order to solve the optimization for SWIPT, we consider the following maximization of the weighted sum of the transmitted information and the delivered power
(22)  
s.t. 
We solve this problem using the Matlab command fmincon() as follows. is given different values, starting from larger ones. (Footnote 11: Note that can be interpreted as . Therefore, intuitively, a larger value of corresponds to a higher delivered power and lower transmitted information.) For the first round of the optimization (corresponding to the largest value of ), the (locally) optimal solution obtained through the previous optimization (WPT with deterministic inputs) is used as the starting point (the power for different r-subchannels is considered as ). Similarly, for the subsequent values of , we use the solution corresponding to the previous value of as the starting point. The detailed description of the optimization is presented in Algorithm 1.
VI-B Zero mean inputs
In order to obtain the optimal power allocations for zero mean complex Gaussian inputs, we follow an approach similar to that presented in Section VI-A. The optimization problem considered here is given as
(23)  
s.t. 
The optimization is explained in Algorithm 2. (Footnote 12: We note that, as an alternative approach, the optimality conditions in Lemma 1 can be used to find the optimal power allocations. To do so, the nonlinear equations (14) and (15) have to be solved under the constraints . Accordingly, one can use the MATLAB command fsolve(). The optimization is initialized with a very small (in norm) power vector, and the vector is updated at each iteration until a convergence condition is met.)
VI-C Numerical results
In this section, we present the results obtained through numerical simulations. First, we focus on the optimized RP regions corresponding to different types of channel inputs. Later, we compare the constellation of optimized nonzero mean and zero mean complex Gaussian inputs on different points of their corresponding optimized RP region.
In Figure 3, the RP regions for Asymmetric Nonzero mean Gaussian (ANG) inputs, presented in this paper, Symmetric Nonzero mean Gaussian (SNG) inputs, presented in [16], and Zero mean Gaussian (ZG) inputs are shown. (Footnote 13: The channel we have used for our simulations comprises c-subchannels with coefficients as .) We also obtain the RP region corresponding to the optimal power allocations for the linear model assumption of the EH. This is done by obtaining the power allocations from [3, Equation (9)] for different constraints and calculating the corresponding delivered power and transmitted information. This region is denoted by Zero mean Gaussian for Linear model (ZGL). As observed in Figure 3, due to the asymmetric power allocation in ANG, there is an improvement in the RP region compared to SNG. Additionally, it is observed that ANG and SNG achieve a larger RP region than optimized ZG, which in turn performs better than ZGL (highlighting the fact that in scenarios where the nonlinear model for the EH is valid, ZGL is no longer optimal). The main reason for the improvement in the RP regions corresponding to ANG and SNG is that allowing the mean of the channel inputs to be nonzero boosts the fourth-order term in (6), resulting in a larger contribution to the delivered power at the receiver (more explanations can be found in [16]).
In Figure 4, from left to right, the optimized inputs in terms of their complex mean (represented as dots) and their corresponding r-subchannel variances (represented as ellipses) are shown for points and in Figure 3, respectively. Point represents the maximum delivered power with zero transmitted information (note that the information of a deterministic signal is zero). Point represents the performance of a typical input used for power and information transfer. Finally, point represents the performance of an input obtained via water-filling (when the delivered power constraint is inactive). From these plots it is observed that as we move from point to point , the means of the different r-subchannels decrease; however, they roughly keep their relative structure. Also, as we move to point , the means of the different r-subchannels tend to zero, with their variances increasing asymmetrically until the power allocation reaches the water-filling solution (where the power allocation between the real and imaginary components is symmetric). This result is in contrast with the results in [16], where the power allocation to the real and imaginary components in each c-subchannel is symmetric. Similar results regarding the benefit of asymmetric power allocation have also been reported in [13] for the deterministic AWGN channel with nonlinear EH.
In Figure 3, the point corresponds to the input for which all of the c-subchannels other than the strongest one (in terms of the ) have zero power. For the strongest c-subchannel, at point , all the transmit power is allocated to either the real or the imaginary component of the c-subchannel. The reason for this observation is explained in Remark 5. This observation is also in line with the result of [13], where it is shown that for a flat fading channel, the maximum power is obtained by allocating all the transmit power to only one r-subchannel. Note that this is different from the power allocation with the linear model (i.e., ZGL), for which all the transmit power would also be allocated to the strongest c-subchannel to maximize the delivered power, but divided equally between the real and imaginary parts of the input.
In Figure 5, the variances of different r-subchannels corresponding to the point in Figure 3 are illustrated. Numerical optimization reveals that, as we move from point to point (increasing the information demand at the receiver) in Figure 3, the variance of the strongest c-subchannel varies asymmetrically (in its real and imaginary components). This observation can be justified as follows. For higher values of (equivalent to higher delivered power demands), the strongest c-subchannel receives a power allocation similar to the solutions or in Figure 2, whereas the other c-subchannels take the power allocation corresponding to the point in Figure 2. (Footnote 14: For very low average power constraints, it is observed that the power allocation is symmetric across all the c-subchannels. This can be justified by noting that for very low average power constraints, the admissible power allocations correspond to solutions similar to the point D in Figure 2.) Note that the power allocation at point is the water-filling solution.
Remark 6.
In Figure 6 (using the optimization algorithm explained earlier in Algorithm 1), the RP regions are obtained for . It is observed that the delivered power at the receiver increases with the number of c-subchannels . This is due to the presence of input moments (higher than 2) in the delivered power in (11), and is in line with observations made in [7, 16]. (Footnote 15: We note that, in practical implementations, this observation (increasing delivered power with ) cannot be valid for all , and the delivered power saturates after some . This is due to the diode breakdown effect, which has not been considered in our model (6) due to the small-signal analysis. This is further discussed in [16].)
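The role of moments higher than 2 can be illustrated with a simple deterministic example. The sketch below is not an evaluation of (11): it considers a real multisine with equal amplitudes and aligned phases under a fixed average power, with hypothetical tone indices starting at `f0`, and computes its time-averaged second and fourth powers over one period.

```python
import math


def multisine_moments(num_tones, total_power=1.0, f0=10, samples=512):
    """Time-averaged y^2 and y^4 over one period for an equal-amplitude,
    phase-aligned multisine whose average power equals total_power."""
    amp = math.sqrt(2.0 * total_power / num_tones)
    second = fourth = 0.0
    for m in range(samples):
        t = m / samples
        y = amp * sum(math.cos(2 * math.pi * (f0 + n) * t)
                      for n in range(num_tones))
        second += y * y
        fourth += y ** 4
    return second / samples, fourth / samples


# Same average power in every case; only the number of tones changes.
results = {n: multisine_moments(n) for n in (1, 2, 4, 8)}
```

The second moment stays at the power budget for every number of tones, whereas the fourth moment grows with the number of tones, which is consistent with the trend reported for Figure 6 within the small-signal regime.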
As another interesting observation, in Figure 7, the numerically optimized inputs for WPT, i.e., zero variance inputs (under the assumption of flat fading for the channel), are illustrated for . As mentioned in Algorithm 1, for each , the optimization is run many times, each time fed with a randomly generated starting point. The phases of the mean on different c-subchannels are also equally spaced.