I Introduction
Future Internet-of-Things (IoT) networks are envisioned to comprise an enormous number of mobile devices (e.g., sensors) for enabling big-data applications such as smart cities and realizing edge artificial intelligence (AI) [1, 2]. Therefore, how to efficiently aggregate massive data distributed at dense devices is becoming increasingly important. A technology called over-the-air computation (AirComp) has recently emerged as a promising multi-access scheme for such fast wireless data aggregation (WDA), especially in ultra-low-latency and high-mobility scenarios (e.g., for patrolling drones or emergency medical devices) [3]. In AirComp, a fusion center (FC) can exploit the over-the-air signal-superposition property of a multiple-access channel (MAC) to efficiently average the data simultaneously transmitted by many devices in an analog manner. With proper pre-processing at devices and post-processing at the FC, AirComp can not only perform averaging but also compute a class of so-called nomographic functions of distributed data, such as the geometric mean and polynomials. Recently, the applications of AirComp have expanded from sensor networks towards networks supporting edge machine learning [4, 5].
The idea of AirComp was first proposed in the pioneering work [6], where structured codes were designed for reliable functional computation over a MAC by exploiting the interference caused by simultaneous transmissions. Building upon this work, it was proved that simple uncoded analog transmission is optimal for achieving the minimum distortion in a network with independent and identically distributed (i.i.d.) Gaussian data sources [7]. However, coding was generally required for distortion minimization in other scenarios with, e.g., bivariate Gaussian distributed [8] and correlated Gaussian distributed data sources [9]. In another line of research, AirComp under uncoded analog transmission was investigated from the signal processing perspective. For instance, in wireless sensor networks, optimal linear decentralized estimation under static channels was studied in [10], while the optimal distortion outage performance for AirComp was investigated in [11], where an outage happens when the estimation error or distortion exceeds a predetermined threshold. A so-called “AirShare” design was presented in [12] for resolving the synchronization problem in distributed transmission, in which the FC broadcasts a shared clock to all devices. More recently, the multiple-input-multiple-output (MIMO) technique was exploited in [13, 14] to enable vector-valued AirComp targeting multi-modal sensing, in which zero-forcing precoding at sensors and aggregation beamforming at the FC are jointly designed for minimizing the computation error measured by the mean square error (MSE). Along this vein, the authors in [15] integrated MIMO AirComp with the wireless power transfer (WPT) technique to enable self-sustainable AirComp for low-power devices. Besides, a blind MIMO AirComp scheme that does not require channel state information (CSI) was proposed in [16] for low-complexity and low-latency IoT networks. In addition, AirComp has also found various applications beyond distributed sensing in wireless sensor networks. For example, a compute-and-forward scheme for wireless relaying systems was presented in [17], in which, based on an idea similar to AirComp, the relay computes linear functions of messages sent from sources and forwards them to destinations for robust decoding. Moreover, the authors in [4, 5] applied AirComp in federated learning systems to enable communication-efficient collaborative AI-model training by exploiting distributed data at edge devices.
To enable reliable AirComp over fading channels, it is crucial to adapt the devices’ transmit power to channels for coping with channel distortion caused by noise and fading, thereby achieving the desired signal-magnitude alignment at the FC. Despite extensive research, most prior work on AirComp assumes simple channel-inversion power control (or equivalently zero-forcing precoding in the MIMO case) at devices, such that their transmitted signals are perfectly aligned in magnitude at the FC receiver to yield the desired function [13, 14, 15, 4, 5]. However, the scheme is suboptimal and can severely degrade the AirComp performance in the presence of deep fading. Specifically, it is well known that channel inversion may significantly enhance noise, especially when devices are subject to stringent power constraints, as is typically the case for sensors.
It has been well established that optimal power control over fading channels is crucial to approach the performance limit of wireless communication systems (e.g., for achieving the capacity in point-to-point fading channels [18, 19] and cognitive radio systems [20, 21, 22]). For the same reason, it is equally important to investigate optimal power control for AirComp over fading channels. This topic, however, is a largely uncharted area. Although there exist relevant studies on the optimal power control to minimize the MSE for decentralized estimation [10, 11], they focus on the static channel scenario with a total power constraint for all devices. The scenario does not fit the AirComp system of our interest. This motivates the current work.
In this paper, we consider an IoT network with AirComp over a fading MAC, in which mobile devices simultaneously transmit the sensing data to the FC for averaging. We investigate the problem of optimal power control for reliable AirComp. Our objective is to minimize the computation error (measured by MSE) by jointly optimizing the transmit power at each device and the signal scaling factor for noise suppression, called denoising factor, applied at the FC, subject to devices’ individual average power constraints. The MSE minimization requires optimally balancing the tradeoff between reducing the error in signal alignment (required for averaging) and suppressing the noise. This leads to a challenging optimization problem due to its nonconvexity arising from the coupling of the control variables, namely the transmit powers and denoising factor. Through the joint design, the optimal power control contributes an additional dimension for reducing the computation error in AirComp. The main contributions of this work are summarized as follows.

Optimal power control for static channels: In this case, the channels are fixed in time but vary over devices. The original problem is reduced to a nonconvex optimization problem. The derived optimal power-control policy in closed form is found to have a threshold-based structure. The threshold is applied on a derived quality indicator at each device, which accounts for both the power budget of the device and its channel power gain. It is shown that if the quality indicator exceeds the optimized threshold, the device applies channel-inversion power control; otherwise, it performs full power transmission. Furthermore, asymptotic analysis is provided to show the behaviour of the optimal policy in the high and low signal-to-noise ratio (SNR) regimes. To be specific, the optimal power control reduces to channel inversion and full power transmission in the high and low SNR regimes, respectively.
Optimal power control for time-varying channels: In this case, the channels vary both in time and over devices. Due to the coupling of the power control at devices and the time-varying denoising factor, the formulated problem is a more challenging nonconvex problem. However, the problem is shown to satisfy the “time-sharing” condition, and thus strong duality holds between the original problem and its dual problem. This motivates us to leverage the Lagrange-duality method to solve the problem. As a result, the optimal power-control policy over devices and fading states is derived to have the structure of regularized channel inversion, where the regularization balances signal-magnitude alignment against noise suppression. Moreover, for the special case with only one device being power limited, we show that at optimality, this device should adopt a power-control policy combining channel inversion and water-filling, while the other devices (with sufficiently large power budgets) should transmit with channel-inversion power control.

Simulation validation: Simulation results are presented to validate the derived analytical results. It is shown that the optimal power control substantially improves the AirComp performance, compared with heuristic designs such as uniform or channel-inversion power control. Furthermore, it is observed that the performance gain over the uniform power control is marginal in the low SNR regime but becomes substantial as the SNR increases. This finding is in sharp contrast with the conventional water-filling power control for rate maximization targeting communication over fading channels (see, e.g., [18]), where the adaptive power control is crucial in the low to moderate SNR regimes.
The remainder of the paper is organized as follows. The fading AirComp system is modeled in Section II, in which the power control problem for MSE minimization is formulated. The optimal power-control policy in the special case with static channels is presented in Section III. The optimal power-control policy in the general case with time-varying channels is obtained in Section IV. Finally, simulation results are provided in Section V, followed by the conclusion in Section VI.
II System Model and Problem Formulation
This paper considers AirComp over a MAC as shown in Fig. 1, in which one FC aggregates the information from a set $\mathcal{K} \triangleq \{1,\ldots,K\}$ of $K$ mobile devices, each with one antenna. We consider the case with block fading channels, where the wireless channels remain unchanged within each time slot but may change from one slot to another. Let $\mathcal{T} \triangleq \{1,\ldots,T\}$ denote the set of time slots of interest, where $T$ is the number of slots that is assumed to be sufficiently large. It is assumed that the channel coefficients over different time slots are generated from a stationary and ergodic stochastic process. Suppose that the devices record a set of time-varying parameters of an environment, and the FC needs to recover the average of the measured data from devices (Footnote 1: Other functions such as the geometric mean are also possible with pre-processing and post-processing [3]). Let $Z_k(t)$ denote the data measured by device $k \in \mathcal{K}$ at an arbitrary time slot $t \in \mathcal{T}$. Thus the function of interest at the FC at time slot $t$ is given by
$$f(t) = \frac{1}{K}\sum_{k\in\mathcal{K}} Z_k(t). \tag{1}$$
Instead of directly transmitting $Z_k(t)$, we find it convenient to transmit its normalized version, denoted by $s_k(t) \triangleq g(Z_k(t))$, to facilitate power control. The function $g(\cdot)$ denotes the normalization operation, which is linear and uniform across all devices, to ensure that $\{s_k(t)\}$ have zero mean and unit variance. Upon receiving the average of the transmitted data $\{s_k(t)\}$, i.e.,
$$\bar{f}(t) = \frac{1}{K}\sum_{k\in\mathcal{K}} s_k(t), \tag{2}$$
the desired $f(t)$ can be simply recovered from $\bar{f}(t)$ by applying the following de-normalization operation:
$$f(t) = g^{-1}\big(\bar{f}(t)\big), \tag{3}$$
where $g^{-1}(\cdot)$ denotes the inverse function of $g(\cdot)$. Due to the one-to-one mapping between $f(t)$ and $\bar{f}(t)$, we hereafter refer to $\bar{f}(t)$ as the target-function value at slot $t$ for ease of exposition.
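At each time slot $t \in \mathcal{T}$, let $h_k(t)$ and $b_k(t)$ denote the channel coefficient from device $k$ to the FC and the transmit coefficient at device $k$, respectively. The received signal at the FC is
$$y(t) = \sum_{k\in\mathcal{K}} h_k(t)\, b_k(t)\, s_k(t) + w(t), \tag{4}$$
where $w(t)$ denotes the additive white Gaussian noise (AWGN) at the FC receiver with zero mean and variance $\sigma^2$. Assume $b_k(t) = \sqrt{p_k(t)}\, h_k^{\dagger}(t)/|h_k(t)|$, where $p_k(t) \ge 0$ denotes the transmit power at device $k$ at time slot $t$ and $\dagger$ denotes the conjugate operation. Hence, we have
$$y(t) = \sum_{k\in\mathcal{K}} \sqrt{p_k(t)}\,|h_k(t)|\, s_k(t) + w(t). \tag{5}$$
(Footnote 2: For simplicity, we assume the availability of perfect CSI at each device transmitter and the FC receiver. Therefore, the time-varying phase shifts introduced by channel fading can be perfectly compensated, which allows us to focus on the transmit power control only.)
Upon receiving the signal in (5), we suppose that the FC applies a signal scaling factor for noise suppression, called the denoising factor and denoted by $\eta(t)$, to recover the average message of interest as
$$\hat{f}(t) = \frac{y(t)}{K\sqrt{\eta(t)}}, \tag{6}$$
where the post-processing factor $1/K$ is employed in (6) for the averaging purpose. We are interested in minimizing the distortion of the recovered average of the transmitted data, $\hat{f}(t)$, with respect to (w.r.t.) the ground-truth average $\bar{f}(t)$. The distortion at a given time slot $t$ is measured by the corresponding instantaneous MSE defined as
$$\mathrm{MSE}(t) = \mathbb{E}\big[(\hat{f}(t) - \bar{f}(t))^2\big] = \frac{1}{K^2}\left(\sum_{k\in\mathcal{K}} \left(\frac{\sqrt{p_k(t)}\,|h_k(t)|}{\sqrt{\eta(t)}} - 1\right)^{2} + \frac{\sigma^2}{\eta(t)}\right), \tag{7}$$
where the expectation is over the distribution of the transmitted signals $\{s_k(t)\}$, and the second equality uses the zero-mean, unit-variance property of $\{s_k(t)\}$ together with their independence across devices and from the noise. Then, for sufficiently large $T$ (or $T \to \infty$), the time-averaging MSE of interest is given by
$$\overline{\mathrm{MSE}} = \lim_{T\to\infty} \frac{1}{T}\sum_{t\in\mathcal{T}} \mathrm{MSE}(t). \tag{8}$$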
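Note that the channel coefficients over slots are assumed to be generated from a stationary and ergodic stochastic process. Let $\boldsymbol{\nu} \triangleq [h_1,\ldots,h_K]$ denote the channel vector collecting the wireless channel coefficients from the device transmitters to the FC receiver, or the fading state [22]. Let $\Pr(\boldsymbol{\nu})$ denote the probability, or the fraction of time, when the channel is in the fading state $\boldsymbol{\nu}$. Accordingly, we define $\mathbb{E}_{\boldsymbol{\nu}}[\psi(\boldsymbol{\nu})]$ as the expectation of any function $\psi(\boldsymbol{\nu})$ over fading states. By using the stationary and ergodic nature of the fading channels, the time-average MSE in (8) can be translated to the following ensemble-average MSE:
$$\overline{\mathrm{MSE}} = \frac{1}{K^2}\,\mathbb{E}_{\boldsymbol{\nu}}\left[\sum_{k\in\mathcal{K}} \left(\frac{\sqrt{p_k(\boldsymbol{\nu})}\,|h_k|}{\sqrt{\eta(\boldsymbol{\nu})}} - 1\right)^{2} + \frac{\sigma^2}{\eta(\boldsymbol{\nu})}\right], \tag{9}$$
where $p_k(\boldsymbol{\nu})$ and $\eta(\boldsymbol{\nu})$ denote device $k$’s transmit power and the FC’s denoising factor at any fading state $\boldsymbol{\nu}$, respectively.
Furthermore, in practice, each device $k$ is constrained by an average power budget $\bar{P}_k$. Therefore, we have the devices’ individual average transmit power constraints as
$$\mathbb{E}_{\boldsymbol{\nu}}\big[p_k(\boldsymbol{\nu})\big] \le \bar{P}_k, \quad \forall k \in \mathcal{K}. \tag{10}$$
Our objective is to minimize $\overline{\mathrm{MSE}}$ in (9), by jointly optimizing the power control $\{p_k(\boldsymbol{\nu})\}$ at devices and the denoising factors $\{\eta(\boldsymbol{\nu})\}$ at the FC, subject to the individual average transmit power constraints at devices in (10). Therefore, the optimization problem of interest is formulated as (P1) in the following, where we omit the constant coefficient $1/K^2$ in (9) for notational convenience.
$$(\text{P1}):\ \min_{\{p_k(\boldsymbol{\nu})\ge 0\},\,\{\eta(\boldsymbol{\nu})\ge 0\}}\ \mathbb{E}_{\boldsymbol{\nu}}\left[\sum_{k\in\mathcal{K}} \left(\frac{\sqrt{p_k(\boldsymbol{\nu})}\,|h_k|}{\sqrt{\eta(\boldsymbol{\nu})}} - 1\right)^{2} + \frac{\sigma^2}{\eta(\boldsymbol{\nu})}\right] \quad \text{s.t.}\ \ \mathbb{E}_{\boldsymbol{\nu}}\big[p_k(\boldsymbol{\nu})\big] \le \bar{P}_k,\ \forall k \in \mathcal{K}.$$
It is observed that the objective function of problem (P1) (i.e., the ensemble-average MSE) consists of two components representing the signal misalignment error (i.e., $\mathbb{E}_{\boldsymbol{\nu}}\big[\sum_{k}(\sqrt{p_k(\boldsymbol{\nu})}|h_k|/\sqrt{\eta(\boldsymbol{\nu})} - 1)^2\big]$) and the noise-induced error (i.e., $\mathbb{E}_{\boldsymbol{\nu}}[\sigma^2/\eta(\boldsymbol{\nu})]$), respectively. In general, enlarging $\eta(\boldsymbol{\nu})$ can reduce the noise-induced error component but lead to an increased signal misalignment error (due to limited transmit power at devices), while reducing $\eta(\boldsymbol{\nu})$ can suppress the signal misalignment error (as less power is required for signal alignment) but at the cost of enhancing the noise-induced error. Therefore, in problem (P1) of our interest, there exists a fundamental tradeoff between minimizing the signal misalignment error and suppressing the noise-induced error. Moreover, due to the coupling of the transmit power $\{p_k(\boldsymbol{\nu})\}$ and denoising factors $\{\eta(\boldsymbol{\nu})\}$, problem (P1) is nonconvex in general, and thus challenging to solve.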
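The two error components can be made concrete with a small numerical sketch. It assumes a real-valued model after phase compensation, i.i.d. standard Gaussian data symbols, and the scaling `y / (K * sqrt(eta))` at the FC; the function names `aircomp_mse` and `empirical_mse` and all parameter names are illustrative choices for this sketch, not taken from the paper.

```python
import math
import random


def aircomp_mse(p, h_abs, eta, sigma2):
    """Closed-form per-slot MSE: (misalignment error + noise-induced error) / K^2.

    p      : transmit powers, one per device
    h_abs  : channel magnitudes |h_k|, one per device
    eta    : denoising factor applied at the FC (must be > 0)
    sigma2 : receiver noise variance
    """
    K = len(p)
    misalignment = sum(
        (math.sqrt(pk) * hk / math.sqrt(eta) - 1.0) ** 2 for pk, hk in zip(p, h_abs)
    )
    return (misalignment + sigma2 / eta) / K ** 2


def empirical_mse(p, h_abs, eta, sigma2, n_trials=100_000, seed=0):
    """Monte Carlo estimate of the same quantity by simulating the superposition."""
    rng = random.Random(seed)
    K = len(p)
    gains = [math.sqrt(pk) * hk for pk, hk in zip(p, h_abs)]
    noise_std = math.sqrt(sigma2)
    total = 0.0
    for _ in range(n_trials):
        s = [rng.gauss(0.0, 1.0) for _ in range(K)]        # normalized data symbols
        y = sum(g * sk for g, sk in zip(gains, s)) + rng.gauss(0.0, noise_std)
        f_hat = y / (K * math.sqrt(eta))                   # recovered average
        f_bar = sum(s) / K                                 # true average
        total += (f_hat - f_bar) ** 2
    return total / n_trials
```

With channel-inversion powers `p_k = eta / h_abs[k]**2`, the misalignment term vanishes and only the noise-induced term `sigma2 / eta` remains, which illustrates why a larger denoising factor helps only as long as the devices can afford the power needed to stay aligned.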
III Optimal Power Control under Static Channels
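In this section, we consider the special case of problem (P1) under static channels, namely $h_k$ remains unchanged over time for each device $k$. Accordingly, we suppose that different devices apply fixed transmit power and the FC applies a fixed denoising factor, i.e., $p_k(\boldsymbol{\nu}) = p_k$, $\forall k$, and $\eta(\boldsymbol{\nu}) = \eta$. Furthermore, to facilitate the subsequent derivation, for each device $k$, we define the quality indicator as the product of its power budget and channel power gain (i.e., $\bar{P}_k |h_k|^2$). Then we make the following assumption without loss of generality:
$$\bar{P}_1 |h_1|^2 \le \bar{P}_2 |h_2|^2 \le \cdots \le \bar{P}_K |h_K|^2. \tag{11}$$
As a result, problem (P1) is reduced to the following power control problem:
$$(\text{P2}):\ \min_{\{p_k\},\,\eta \ge 0}\ \sum_{k\in\mathcal{K}} \left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta}} - 1\right)^{2} + \frac{\sigma^2}{\eta} \quad \text{s.t.}\ \ 0 \le p_k \le \bar{P}_k,\ \forall k \in \mathcal{K}.$$
However, problem (P2) is still nonconvex due to the coupling of $\{p_k\}$ and $\eta$. To overcome the challenge, we first optimize over $\{p_k\}$ under any given $\eta$, and then search for the optimal $\eta$. The derived optimal solution exhibits an interesting threshold-based structure as elaborated in the following subsection.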
III-A Threshold-Based Optimal Power Control
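In this subsection, we first present the optimal solution to problem (P2) by optimizing over $\{p_k\}$ under any given $\eta$, and then searching for the optimal $\eta$. First, we consider the optimization over $\{p_k\}$ under a given $\eta$. In this case, we decompose problem (P2) into the following $K$ subproblems, each for optimizing $p_k$:
$$\min_{0 \le p_k \le \bar{P}_k}\ \left(\frac{\sqrt{p_k}\,|h_k|}{\sqrt{\eta}} - 1\right)^{2}, \tag{12}$$
where the constant term w.r.t. $p_k$ (i.e., $\sigma^2/\eta$) is ignored. It is evident that the optimal power-control policy to problem (12) for device $k$ is given by
$$p_k^{*} = \min\left(\bar{P}_k,\ \frac{\eta}{|h_k|^2}\right). \tag{13}$$
Next, we proceed to optimize over $\eta$. By substituting $p_k^{*}$ in (13) back to problem (P2), we have the optimization problem over $\eta$ as
$$\min_{\eta \ge 0}\ F(\eta) \triangleq \sum_{k\in\mathcal{K}} \left(\min\left(\frac{\sqrt{\bar{P}_k}\,|h_k|}{\sqrt{\eta}},\ 1\right) - 1\right)^{2} + \frac{\sigma^2}{\eta}, \tag{14}$$
with $F(\eta)$ denoting the objective function w.r.t. $\eta$. To solve problem (14), we need to remove the “min” operation to simplify the derivation. To this end, we find it convenient to adopt a divide-and-conquer approach that divides the feasible set of problem (14), namely $\{\eta \ge 0\}$, into $K+1$ intervals. Note that each of them is defined as
$$\mathcal{F}_k = \left\{\eta \ \middle|\ \bar{P}_k |h_k|^2 \le \eta \le \bar{P}_{k+1} |h_{k+1}|^2\right\}, \quad k \in \{0,\ldots,K\}, \tag{15}$$
where we define $\bar{P}_0 |h_0|^2 \triangleq 0$ and $\bar{P}_{K+1} |h_{K+1}|^2 \to \infty$ for notational convenience. Then, it is easy to establish the equivalence between the following two sets
$$\{\eta \ge 0\} = \bigcup_{k=0}^{K} \mathcal{F}_k. \tag{16}$$
Given (16), we note that solving problem (14) is equivalent to solving the following $K+1$ subproblems and comparing their optimal values to obtain the minimum one.
$$\min_{\eta \in \mathcal{F}_k}\ F_k(\eta) \triangleq \sum_{i=1}^{k} \left(\frac{\sqrt{\bar{P}_i}\,|h_i|}{\sqrt{\eta}} - 1\right)^{2} + \frac{\sigma^2}{\eta}, \quad k \in \{0,\ldots,K\}, \tag{17}$$
with $F_k(\eta)$ denoting the objective function of the $k$-th subproblem. Notice that when $k = 0$, we have $F_0(\eta) = \sigma^2/\eta$. In addition, we have
$$\min_{\eta \ge 0} F(\eta) = \min_{k \in \{0,\ldots,K\}} F_k(\eta_k^{*}). \tag{18}$$
More specifically, after solving each subproblem in (17), we can obtain the optimal solution to problem (14) by comparing their optimal values in (17), where $\eta_k^{*}$ denotes the optimal denoising factor in each interval $\mathcal{F}_k$. Let $\eta^{*}$ denote the globally optimal solution for problem (14). First, we have the following lemma.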
Lemma 1.
Given (11), the optimal denoising factor must satisfy $\eta^{*} \ge \bar{P}_1 |h_1|^2$, such that device $1$ (that with the smallest quality indicator) should always transmit with full power, i.e., $p_1^{*} = \bar{P}_1$.
Proof:
See Appendix A. ∎
Lemma 1 suggests that we only need to focus on solving each $k$-th subproblem in (17) with $k \in \{1,\ldots,K\}$, for which we have the following lemma.
Lemma 2.
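For any $k \in \{1,\ldots,K\}$, the function $F_k(\eta)$ is a unimodal function that first decreases in $[0, \tilde{\eta}_k]$ and then increases in $[\tilde{\eta}_k, \infty)$, where $\tilde{\eta}_k$ is the stationary point given by
$$\tilde{\eta}_k = \left(\frac{\sigma^2 + \sum_{i=1}^{k} \bar{P}_i |h_i|^2}{\sum_{i=1}^{k} \sqrt{\bar{P}_i}\,|h_i|}\right)^{2}.$$
Therefore, the optimal solution to the $k$-th subproblem in (17) is given by
$$\eta_k^{*} = \min\left(\bar{P}_{k+1} |h_{k+1}|^2,\ \max\left(\tilde{\eta}_k,\ \bar{P}_k |h_k|^2\right)\right). \tag{19}$$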
Proof:
See Appendix B. ∎
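By comparing the optimal values of the subproblems in (17) obtained in Lemma 2, we can obtain the optimal solution to problem (14). Suppose that $\eta^{*} = \eta_{k^{*}}^{*}$, where
$$k^{*} = \arg\min_{k \in \{1,\ldots,K\}} F_k(\eta_k^{*}). \tag{20}$$
Accordingly, the optimal solution to problem (P2) is derived as follows.
Theorem 1.
With $k^{*}$ defined in (20), the optimal power control over static channels that solves problem (P2) has a threshold-based structure, given by
$$p_k^{*} = \begin{cases} \bar{P}_k, & \forall k \in \{1,\ldots,k^{*}\}, \\ \dfrac{\eta^{*}}{|h_k|^2}, & \forall k \in \{k^{*}+1,\ldots,K\}, \end{cases} \tag{21}$$
where the threshold is given as
$$\eta^{*} = \tilde{\eta}_{k^{*}} = \left(\frac{\sigma^2 + \sum_{i=1}^{k^{*}} \bar{P}_i |h_i|^2}{\sum_{i=1}^{k^{*}} \sqrt{\bar{P}_i}\,|h_i|}\right)^{2}. \tag{22}$$
Furthermore, it holds that $\bar{P}_k |h_k|^2 \le \eta^{*}$ for devices $k \in \{1,\ldots,k^{*}\}$ and $\bar{P}_k |h_k|^2 \ge \eta^{*}$ for devices $k \in \{k^{*}+1,\ldots,K\}$.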
Proof:
See Appendix C. ∎
Remark 1 (Threshold-Based Optimal Power Control).
Based on the optimal solution to problem (P2), we have the following insights on the optimal power control for AirComp over devices in static channels. The optimal power-control policy over devices has a threshold-based structure. The threshold is specified by the denoising factor $\eta^{*}$ and applied on the derived quality indicator $\bar{P}_k |h_k|^2$, accounting for both the channel power gain and power budget of device $k$. It is shown that for each device with its quality indicator exceeding the threshold, i.e., $\bar{P}_k |h_k|^2 \ge \eta^{*}$, the channel-inversion power control is applied with $p_k^{*} = \eta^{*}/|h_k|^2$; while for each device with $\bar{P}_k |h_k|^2 \le \eta^{*}$, the full power transmission is deployed with $p_k^{*} = \bar{P}_k$. As shown in (22), the optimal threshold $\eta^{*}$ is a monotonically increasing function w.r.t. the noise variance $\sigma^2$. This says that a larger $\sigma^2$ leads to more devices transmitting with full power and vice versa. The result is intuitive by noting that $\eta^{*}$ also plays another role as the denoising factor: a large $\sigma^2$ requires a large $\eta^{*}$ for suppressing the dominant noise-induced error.
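The interval-by-interval search described above can be sketched in a few lines. The sketch below assumes the static-channel objective is the sum over devices of `(sqrt(p_k)*|h_k|/sqrt(eta) - 1)**2` plus `sigma2/eta`, with per-device power caps `P_bar[k]` and strictly positive `P_bar` and `h_abs`; the function name `solve_static` and its return layout are hypothetical and chosen only for illustration.

```python
import math


def solve_static(P_bar, h_abs, sigma2):
    """Threshold-based power control for static channels (illustrative sketch).

    Returns (powers, eta_opt, k_star): powers are in the caller's device order,
    eta_opt is the denoising factor, and k_star is the index of the winning
    subproblem (devices counted in ascending order of quality indicator).
    """
    K = len(P_bar)
    # a[k] is the square root of the quality indicator P_bar[k] * |h_k|^2.
    a = [math.sqrt(Pk) * hk for Pk, hk in zip(P_bar, h_abs)]
    a_sorted = sorted(a)

    def F(k, eta):
        # Objective of the k-th subproblem: the k weakest devices use full power.
        r = math.sqrt(eta)
        return sum((a_sorted[i] / r - 1.0) ** 2 for i in range(k)) + sigma2 / eta

    best = None
    s1 = s2 = 0.0
    for k in range(1, K + 1):
        s1 += a_sorted[k - 1]
        s2 += a_sorted[k - 1] ** 2
        eta_tilde = ((sigma2 + s2) / s1) ** 2          # stationary point of F_k
        lo = a_sorted[k - 1] ** 2                       # left end of interval k
        hi = a_sorted[k] ** 2 if k < K else math.inf    # right end of interval k
        eta_k = min(hi, max(eta_tilde, lo))             # project onto the interval
        val = F(k, eta_k)
        if best is None or val < best[0]:
            best = (val, eta_k, k)

    _, eta_opt, k_star = best
    # Full power below the threshold, channel inversion above it.
    powers = [min(P_bar[k], eta_opt / h_abs[k] ** 2) for k in range(K)]
    return powers, eta_opt, k_star
```

For instance, with `P_bar = [1.0, 2.0, 0.5, 4.0]`, `h_abs = [0.3, 0.8, 1.5, 1.0]` and `sigma2 = 0.2`, only the device with the smallest quality indicator ends up at full power, and the other three invert their channels down to the common level `eta_opt`.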
III-B Alternative Solution Method
Though we have found the optimal solution above, in this subsection we further provide an in-depth analysis of $\{\tilde{\eta}_k\}$ to reveal more insights that can be exploited for developing faster algorithms to search for $k^{*}$.
Lemma 3.
Proof:
See Appendix E. ∎
The three properties of $\{\tilde{\eta}_k\}$ shown in Lemma 3 suggest three different ways to search for a unique $k^{*}$. Each property involves simple computation of $\tilde{\eta}_k$ and $\bar{P}_k |h_k|^2$ without the need to further solve the subproblems in (17) to find $\eta_k^{*}$, which significantly reduces the complexity of solving problem (P2). Notice that the fastest one may be the one using the property that only involves computing $\tilde{\eta}_k$ sequentially and finding the minimum. Once $k^{*}$ is found, we can directly obtain the optimal transmission power and denoising factor by applying Theorem 1.
The correctness of the derived results in Theorem 1 and Lemma 3 is further verified by an experiment under typical system settings plotted in Fig. 2. Specifically, it is observed from Fig. 2(a) that , as it follows with . Furthermore, for any device , we have ; while for any device , it holds that . These observations are consistent with the three properties obtained in Lemma 3. Besides, as shown in Fig. 2(b), devices transmit with full power while the others transmit with channel-inversion power control, which confirms the results derived in Theorem 1.
III-C Asymptotic Analysis
In this subsection, we analyze the derived optimal power-control policy in two extreme regimes, namely the high and low SNR regimes, respectively.
In the high SNR regime, i.e., $\sigma^2 \to 0$, it is easy to know from problem (14) that the MSE is dominated by the misalignment error. Thus we can expect that the channel-inversion power control is optimal as it minimizes the misalignment error. Aligned with the intuition, it follows from Lemma 1 that $\eta^{*} = \bar{P}_1 |h_1|^2$ should be the global minimizer solving problem (14), and correspondingly we have $k^{*} = 1$ in Theorem 1. Therefore, the optimal power control and denoising factor are given by
$$p_k^{*} = \frac{\bar{P}_1 |h_1|^2}{|h_k|^2},\ \forall k \in \mathcal{K}, \qquad \eta^{*} = \bar{P}_1 |h_1|^2. \tag{23}$$
In the low SNR regime with $\sigma^2 \to \infty$, the noise-induced error becomes dominant, and thereby it is desirable to maximize $\eta$ to suppress the noise-induced error, i.e., $\eta^{*} \ge \bar{P}_K |h_K|^2$ must hold. Then we have $k^{*} = K$ in Theorem 1. In this case, all devices should transmit with full power to achieve the optimal $\eta^{*}$. The optimal power control and denoising factor in this case are
$$p_k^{*} = \bar{P}_k,\ \forall k \in \mathcal{K}, \qquad \eta^{*} = \left(\frac{\sigma^2 + \sum_{i\in\mathcal{K}} \bar{P}_i |h_i|^2}{\sum_{i\in\mathcal{K}} \sqrt{\bar{P}_i}\,|h_i|}\right)^{2}. \tag{24}$$
In summary, the above analysis suggests that the optimal power control reduces to channel inversion and full power transmission in the high and low SNR regimes, respectively. The analytical findings will be substantiated by simulation results in Section V.
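Both limits can be checked by hand. The short derivation below assumes that the stationary point of the subproblem with the $k$ weakest devices at full power is $\tilde{\eta}_k = \big((\sigma^2 + \sum_{i \le k} \bar{P}_i |h_i|^2)/\sum_{i \le k} \sqrt{\bar{P}_i}\,|h_i|\big)^2$, with devices indexed in ascending order of $\bar{P}_k |h_k|^2$; the symbols are this sketch's notation and may differ from the typeset original.

```latex
\begin{align*}
% High SNR: the stationary point for k = 1 collapses onto the smallest quality indicator,
% so every device can align to the weakest one by channel inversion.
\tilde{\eta}_1
  &= \left(\frac{\sigma^2 + \bar{P}_1 |h_1|^2}{\sqrt{\bar{P}_1}\,|h_1|}\right)^{2}
  \;\xrightarrow{\ \sigma^2 \to 0\ }\; \bar{P}_1 |h_1|^2 ,
  &\text{hence } p_k^{*} &= \frac{\bar{P}_1 |h_1|^2}{|h_k|^2}\ \ \forall k . \\[4pt]
% Low SNR: the stationary point for k = K grows without bound with the noise variance,
% so it eventually exceeds every quality indicator and all devices hit their power caps.
\tilde{\eta}_K
  &\ge \left(\frac{\sigma^2}{\sum_{i=1}^{K} \sqrt{\bar{P}_i}\,|h_i|}\right)^{2}
  \;\xrightarrow{\ \sigma^2 \to \infty\ }\; \infty ,
  &\text{hence } p_k^{*} &= \min\!\left(\bar{P}_k,\ \frac{\tilde{\eta}_K}{|h_k|^2}\right) = \bar{P}_k\ \ \forall k .
\end{align*}
```

In the first line the noise term drops out of the numerator; in the second, the bound follows from discarding the non-negative sum in the numerator.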
IV Optimal Power Control under Time-Varying Channels
In this section, we present the optimal power-control policy from solving problem (P1) under time-varying channels. In the following, we first consider the general case with any power budget values of , and then consider a special case with only one device being power limited (i.e., ) for gaining useful design insights.
IV-A Optimal Power Control over Devices and Fading States
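Despite the nonconvexity of problem (P1), it can be shown to satisfy the so-called “time-sharing” condition in [23]. Therefore, strong duality holds between problem (P1) and its Lagrange dual problem [24], and thus we leverage the Lagrange-duality method to optimally solve problem (P1). Let $\mu_k \ge 0$ denote the dual variable associated with the $k$-th constraint in (10), $k \in \mathcal{K}$. Then the partial Lagrangian of problem (P1) is
$$\mathcal{L}\big(\{p_k(\boldsymbol{\nu})\}, \{\eta(\boldsymbol{\nu})\}, \{\mu_k\}\big) = \mathbb{E}_{\boldsymbol{\nu}}\left[\sum_{k\in\mathcal{K}} \left(\frac{\sqrt{p_k(\boldsymbol{\nu})}\,|h_k|}{\sqrt{\eta(\boldsymbol{\nu})}} - 1\right)^{2} + \frac{\sigma^2}{\eta(\boldsymbol{\nu})}\right] + \sum_{k\in\mathcal{K}} \mu_k \Big(\mathbb{E}_{\boldsymbol{\nu}}\big[p_k(\boldsymbol{\nu})\big] - \bar{P}_k\Big).$$
The dual function is
$$g(\{\mu_k\}) = \min_{\{p_k(\boldsymbol{\nu}) \ge 0\},\,\{\eta(\boldsymbol{\nu}) \ge 0\}}\ \mathcal{L}\big(\{p_k(\boldsymbol{\nu})\}, \{\eta(\boldsymbol{\nu})\}, \{\mu_k\}\big). \tag{25}$$
Accordingly, the dual problem of problem (P1) is given as
$$(\text{D1}):\ \max_{\{\mu_k \ge 0\}}\ g(\{\mu_k\}). \tag{26}$$
Since strong duality holds between problems (P1) and (D1), one can solve problem (P1) by equivalently solving its dual problem (D1) [24]. For notational convenience, let $\{p_k^{\mathrm{opt}}(\boldsymbol{\nu})\}$ and $\{\eta^{\mathrm{opt}}(\boldsymbol{\nu})\}$ denote the optimal primal solution to problem (P1), and $\{\mu_k^{\mathrm{opt}}\}$ denote the optimal dual solution to problem (D1). In the following, we first evaluate the dual function $g(\{\mu_k\})$ under any given feasible $\{\mu_k\}$, and then obtain the optimal dual solution $\{\mu_k^{\mathrm{opt}}\}$ to maximize $g(\{\mu_k\})$.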
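First, we obtain $g(\{\mu_k\})$ by solving problem (25) under any given $\{\mu_k\}$, which can be decomposed into a sequence of subproblems, each for one particular fading state $\boldsymbol{\nu}$, as follows:
$$\min_{\{p_k(\boldsymbol{\nu}) \ge 0\},\,\eta(\boldsymbol{\nu}) \ge 0}\ \sum_{k\in\mathcal{K}} \left(\frac{\sqrt{p_k(\boldsymbol{\nu})}\,|h_k|}{\sqrt{\eta(\boldsymbol{\nu})}} - 1\right)^{2} + \frac{\sigma^2}{\eta(\boldsymbol{\nu})} + \sum_{k\in\mathcal{K}} \mu_k\, p_k(\boldsymbol{\nu}). \tag{27}$$
Note that problem (27) is nonconvex. To solve this problem, we first consider the optimization over $\{p_k(\boldsymbol{\nu})\}$ under given $\eta(\boldsymbol{\nu})$, which can be decomposed into $K$ subproblems as follows, each for one device $k$:
$$\min_{p_k(\boldsymbol{\nu}) \ge 0}\ \left(\frac{\sqrt{p_k(\boldsymbol{\nu})}\,|h_k|}{\sqrt{\eta(\boldsymbol{\nu})}} - 1\right)^{2} + \mu_k\, p_k(\boldsymbol{\nu}). \tag{28}$$
Lemma 4.
The optimal solution to problem (28) is given as
$$p_k^{*}(\boldsymbol{\nu}) = \frac{\eta(\boldsymbol{\nu})\,|h_k|^2}{\big(|h_k|^2 + \mu_k\, \eta(\boldsymbol{\nu})\big)^{2}}. \tag{29}$$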
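By replacing $p_k(\boldsymbol{\nu})$ with $p_k^{*}(\boldsymbol{\nu})$ in Lemma 4, problem (27) becomes a single-variable optimization problem over $\eta(\boldsymbol{\nu})$:
$$\min_{\eta(\boldsymbol{\nu}) \ge 0}\ \sum_{k\in\mathcal{K}} \frac{\mu_k\, \eta(\boldsymbol{\nu})}{|h_k|^2 + \mu_k\, \eta(\boldsymbol{\nu})} + \frac{\sigma^2}{\eta(\boldsymbol{\nu})}. \tag{30}$$
Let $\gamma(\boldsymbol{\nu}) = 1/\eta(\boldsymbol{\nu})$, and thus problem (30) becomes a convex optimization problem over $\gamma(\boldsymbol{\nu})$, which is shown as
$$\min_{\gamma(\boldsymbol{\nu})}\ \sum_{k\in\mathcal{K}} \frac{\mu_k}{\gamma(\boldsymbol{\nu})\,|h_k|^2 + \mu_k} + \sigma^2 \gamma(\boldsymbol{\nu}), \tag{31}$$
with $\gamma(\boldsymbol{\nu}) \ge 0$. By checking the first-order derivative of the objective function in problem (31) w.r.t. $\gamma(\boldsymbol{\nu})$, it follows that the optimal solution $\gamma^{*}(\boldsymbol{\nu})$ to problem (31) must satisfy the following equality:
$$\sum_{k\in\mathcal{K}} \frac{\mu_k\, |h_k|^2}{\big(\gamma^{*}(\boldsymbol{\nu})\,|h_k|^2 + \mu_k\big)^{2}} = \sigma^2. \tag{32}$$
Based on (32), the optimal $\gamma^{*}(\boldsymbol{\nu})$ can be efficiently found by a bisection search, since the left-hand side of (32) is monotonically decreasing in $\gamma^{*}(\boldsymbol{\nu})$, though a closed-form expression is not available. As a result, the optimal solution to problem (30) is obtained as $\eta^{*}(\boldsymbol{\nu}) = 1/\gamma^{*}(\boldsymbol{\nu})$. With the optimal $\eta^{*}(\boldsymbol{\nu})$ in hand, we can get the optimal power control based on (29) in Lemma 4, by replacing $\eta(\boldsymbol{\nu})$ with $\eta^{*}(\boldsymbol{\nu})$. Therefore, problem (27) is finally solved under any given $\{\mu_k\}$, and the dual function $g(\{\mu_k\})$ is also accordingly obtained.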
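The bisection step can be sketched as follows for a single fading state. The sketch assumes the per-state problem is to minimize `sum_k mu_k / (gamma*h2_k + mu_k) + sigma2*gamma` over `gamma = 1/eta >= 0`, with strictly positive dual variables `mu` and channel power gains `h2`; the names `solve_gamma` and `regularized_inversion` are illustrative. The fallback to `gamma = 0` when the stationarity condition has no positive root is a choice made in this sketch, not a statement from the paper.

```python
import math


def solve_gamma(h2, mu, sigma2, tol=1e-12, max_iter=200):
    """Bisection on sum_k mu_k*h2_k / (gamma*h2_k + mu_k)^2 = sigma2 over gamma >= 0.

    h2 : channel power gains |h_k|^2 (all > 0)
    mu : dual variables, one per device (all > 0 in this sketch)
    """
    def lhs(g):
        return sum(m * h / (g * h + m) ** 2 for h, m in zip(h2, mu))

    # The left-hand side decreases in gamma; if it is already below sigma2 at
    # gamma = 0, the convex objective is minimized at the boundary gamma = 0.
    if lhs(0.0) <= sigma2:
        return 0.0

    lo, hi = 0.0, 1.0
    while lhs(hi) > sigma2:          # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if lhs(mid) > sigma2:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)


def regularized_inversion(h2, mu, sigma2):
    """Per-state powers p_k = eta*h2_k / (h2_k + mu_k*eta)^2 with eta = 1/gamma."""
    gamma = solve_gamma(h2, mu, sigma2)
    if gamma == 0.0:
        # Boundary case of this sketch: no transmission in this fading state.
        return [0.0] * len(h2), math.inf
    eta = 1.0 / gamma
    powers = [eta * h / (h + m * eta) ** 2 for h, m in zip(h2, mu)]
    return powers, eta
```

As `mu_k` shrinks toward zero, `p_k` approaches the plain channel-inversion value `eta / h2_k`, which is one way to read the dual variable as a regularizer on the inversion.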
Next, we search over $\{\mu_k\}$ to maximize $g(\{\mu_k\})$ for solving problem (D1). Since the dual function $g(\{\mu_k\})$ is concave but non-differentiable in general, one can use subgradient-based methods such as the ellipsoid method [25] to obtain the optimal $\{\mu_k^{\mathrm{opt}}\}$ for problem (D1). Notice that for the objective function in problem (25), the subgradient w.r.t. $\mu_k$ is $\mathbb{E}_{\boldsymbol{\nu}}\big[p_k^{*}(\boldsymbol{\nu})\big] - \bar{P}_k$, $\forall k \in \mathcal{K}$. By replacing $\mu_k$ in Lemma 4 with the obtained optimal $\mu_k^{\mathrm{opt}}$, the optimal solution to (P1) is presented as follows.
Theorem 2.
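The optimal power-control policy over devices and fading states to problem (P1) is given as
$$p_k^{\mathrm{opt}}(\boldsymbol{\nu}) = \frac{\eta^{\mathrm{opt}}(\boldsymbol{\nu})\,|h_k|^2}{\big(|h_k|^2 + \mu_k^{\mathrm{opt}}\, \eta^{\mathrm{opt}}(\boldsymbol{\nu})\big)^{2}}, \quad \forall k \in \mathcal{K},$$
where $\eta^{\mathrm{opt}}(\boldsymbol{\nu})$ can be obtained via a bisection search based on (32) under $\{\mu_k^{\mathrm{opt}}\}$.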
Remark 2 (Regularized Channel Inversion Power Control).
The optimal power control in Theorem 2 is observed to follow an interesting regularized channel inversion structure, with denoising factors $\{\eta^{\mathrm{opt}}(\boldsymbol{\nu})\}$ and dual variables $\{\mu_k^{\mathrm{opt}}\}$ acting as parameters for regularization. More specifically, it is observed that for any device $k$, if $\mu_k^{\mathrm{opt}} > 0$ holds, the average power constraint of device $k$ must be tight at optimality due to the complementary slackness condition (i.e., $\mu_k^{\mathrm{opt}}\big(\mathbb{E}_{\boldsymbol{\nu}}[p_k^{\mathrm{opt}}(\boldsymbol{\nu})] - \bar{P}_k\big) = 0$), and thus this device should use up its transmit power budget based on the regularized channel inversion power control over fading states; otherwise, with $\mu_k^{\mathrm{opt}} = 0$, device $k$ should transmit with channel-inversion power control without using up its power budget.
Furthermore, the following theorem shows that under Rayleigh fading channels, the average power constraints at all devices should be tight. Intuitively, this is due to the fact that in this case, deep channel fading may occur over time, and thus sufficiently large transmit power is required for implementing the regularized channel inversion power control.
Theorem 3 (Rayleigh Fading).
Under Rayleigh fading, where the channel coefficients $h_k$’s are independent circularly symmetric complex Gaussian (CSCG) random variables with zero mean and variance of $\sigma_h^2$, i.e., $h_k \sim \mathcal{CN}(0, \sigma_h^2)$, $\forall k \in \mathcal{K}$, it must hold at the optimal solution to problem (P1) that $\mu_k^{\mathrm{opt}} > 0$, $\forall k \in \mathcal{K}$. In other words, the average power constraints must be tight for all the devices.
Proof:
See Appendix F. ∎
IV-B Special Case with Only One Device Being Power Limited
To get more insights, we consider a special device case, in which only one device, say device , is subject to the average power constraint and all the other devices have sufficiently high power budgets (i.e., ). Then, problem (P1) reduces to
(33) 
First, as each device has unconstrained transmit power, it is evident that under any given , the optimal power control has the following channel inversion structure:
(34) 
By substituting the optimal , , problem (P3) boils down to the optimization over ’s and ’s, i.e.,
(35)  
Next, for problem (35), we first optimize under any given , in which the optimal is given as
(36) 
By substituting into problem (35), the remaining optimization over corresponds to the following convex optimization problem:
(37)  
The optimal solution to problem (37) is obtained by applying the Karush-Kuhn-Tucker (KKT) conditions, as given in the following.
Lemma 5.
Proof:
See Appendix G. ∎
Combining (34), (36), and Lemma 5, the optimal solution to problem (P3) is summarized in the following theorem.
Theorem 4.
For the special case with device being power limited, the optimal power-control policy to problem (P3) is given as
The optimal denoising factors are
From Theorem 4, it is first evident that we have and when the channel gain of device is no larger than the threshold , i.e., . Furthermore, by taking the first-order derivative of w.r.t. , it is observed that the optimal power control is first increasing in and then decreasing in . Substituting into , it is also observed that is first decreasing in and then increasing in .
Remark 3 (Channel-Inversion Water-Filling Power Control).
It is observed from Theorem 4 that the optimal power control of the power-limited device follows a water-filling type structure: there exists a water level (i.e., ), such that when the channel gain is larger than this level, it transmits with positive power; otherwise, it keeps silent without transmission. Furthermore, once transmitting, the transmit power is inversely proportional to the channel gain. Therefore, we name this solution as channel-inversion water-filling power control. The specialty of the solution lies in the non-monotonic relationship between the transmit power and the channel gain. Specifically, as the channel gain decreases till , the power-limited device should increase the transmit power for the purpose of compensating the channel fading to enforce signal-magnitude alignment. When the channel gain continues to decrease from to , the power-limited device should reduce as the channel is too noisy and thus costly to be compensated. When the channel gain is smaller than , device keeps silent to abandon the channel in deep fade to save energy for other fading states. This is in sharp contrast to the conventional water-filling power control for throughput maximization in point-to-point fading channels, with monotonically non-decreasing power control w.r.t. the channel gains. The derived optimal structure is verified by simulation results shown in Fig. 3 under typical system settings, where the transmit power of power-limited device is plotted under varying channel gains (see the solid curve).
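The non-monotonic shape described in this remark can be reproduced with a short sketch. It is derived here from the model as described, not copied from Theorem 4: it assumes that, once the other devices invert their channels and the denoising factor is optimized, the power-limited device minimizes the average of `sigma2 / (p*|h|^2 + sigma2)` subject to an average power budget, whose KKT condition gives `p = max(0, sigma/(sqrt(mu)*|h|) - sigma2/|h|^2)` for a dual variable `mu > 0`. The names `ciwf_power` and `fit_mu` are hypothetical.

```python
import math


def ciwf_power(h_abs, mu, sigma2):
    """Transmit power of the power-limited device at channel magnitude h_abs.

    Zero below the cut-off |h| = sigma*sqrt(mu); above it the power rises,
    peaks at |h| = 2*sigma*sqrt(mu), and then decays as the channel improves.
    """
    sigma = math.sqrt(sigma2)
    return max(0.0, sigma / (math.sqrt(mu) * h_abs) - sigma2 / h_abs ** 2)


def fit_mu(h_samples, P_bar, sigma2, iters=200):
    """Find mu so that the sample-average power meets the budget P_bar.

    Uses bisection in the log domain, since the average power decreases in mu.
    Assumes P_bar is small enough that the budget binds for some mu > 1e-12.
    """
    def avg_power(mu):
        return sum(ciwf_power(h, mu, sigma2) for h in h_samples) / len(h_samples)

    lo, hi = 1e-12, 1.0
    while avg_power(hi) > P_bar:     # enlarge mu until the budget is satisfied
        hi *= 2.0
    for _ in range(iters):
        mid = math.sqrt(lo * hi)     # geometric midpoint
        if avg_power(mid) > P_bar:
            lo = mid
        else:
            hi = mid
    return hi
```

With `sigma2 = 1.0` and `mu = 0.25`, the device is silent for channel magnitudes up to 0.5 and its power peaks at magnitude 1.0, which matches the qualitative rise-then-fall behaviour stated in the remark.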