I Introduction
† This work was supported by the Basic Science Research Program through the National Research Foundation of Korea grants funded by the Ministry of Education [NRF-2018R1D1A1B07040322, NRF-2019R1A6A1A09031717]. The work of O. Simeone was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 725731). The work of S. Shamai was supported by the ERC under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 694630).
Federated learning is an emerging distributed learning paradigm in which mobile devices collaboratively train a machine learning model while preserving the privacy of local data sets [1]. In the presence of latency and bandwidth constraints, the implementation of federated learning on wireless systems is challenging if many workers, or devices, are involved. A potential solution to this problem is over-the-air computation (AirComp), which leverages the superposition property of the multiple access channel (MAC) from worker devices to a parameter server (PS) to allow for simultaneous transmissions from multiple devices [2, 3, 4]. It was reported in [3] that AirComp outperforms a conventional multiple access technique in terms of test accuracy, and that the gain is particularly significant at low transmit power and with a large number of workers.
AirComp assumes that the wireless channels from different devices can be controlled, e.g., via transmitter-side phase compensation, in order to ensure coherent on-air combining [5]. To alleviate this requirement, the work [6] considered a deployment of intelligent reflecting surfaces (IRSs). IRSs, also referred to as reconfigurable intelligent surfaces, can be controlled through integrated electronics in order to shape their response to impinging electromagnetic waves [7]. This enables the modification of the propagation channel between nearby transceivers. As a result, IRSs are considered a cost-effective solution to improve the spectral and energy efficiency of wireless systems [8, 9, 10, 11]. As examples of recent works on IRSs, references [9] and [10] addressed the joint design of downlink beamforming and IRSs’ phases for interference management in multi-user [9] and multi-cell systems [10]. Reference [8] analyzed the number of reflecting elements of IRSs needed to beat conventional wireless relaying techniques (see also [12]). Finally, an information-theoretic study was provided in [13].
In this work, we study the advantages of deploying IRSs for AirComp systems. Unlike [6], which focused on a MAC channel where workers directly communicate with a PS, we consider the large-scale cloud radio access network (C-RAN) illustrated in Fig. 1, in which the workers upload local models to the PS through distributed access points (APs). The APs, or remote radio heads (RRHs), in C-RAN send the received signals to the PS on fronthaul links. The fronthaul links have finite capacity, requiring fronthaul quantization and compression [14]. We tackle the problem of jointly optimizing the IRSs’ reflecting phases and a linear detector at the PS with the goal of minimizing the mean squared error (MSE) of a parameter estimated at the PS. Due to the nonconvexity of the problem, we propose an iterative algorithm that alternately updates the IRSs’ phases and the linear detector. Via numerical results, we validate the advantages of deploying IRSs with optimized phases for AirComp in C-RAN systems.
II System Model
As illustrated in Fig. 1, we consider an over-the-air computation task performed on a C-RAN system. In the system, single-antenna worker devices send locally updated models to a PS through single-antenna APs. Each AP is connected to the PS via a fronthaul link, which we model as a digital link of capacity bit/sample [14]. We define the sets and for the workers’ and APs’ indices, respectively.
II-A Over-the-Air Computation Model
We focus on the transmission at a specific time slot where each worker sends a scalar parameter , and the PS estimates a function of the transmitted parameters . The parameter can be an element of the gradient vector [3] or the local model [4] updated at worker using its local dataset. The PS typically estimates the weighted sum , with , where denotes the number of training samples at device [4]. To simplify the discussion, we assume for all , and that the target parameter denoted by is given by the sum
(1) 
We also assume that the parameters are independent, and we define the power of parameter as . Thus, the target parameter has power .
II-B Channel Model
To assist edge communication from the workers to the APs, we assume the presence of IRSs [6] in the network. Each IRS has reflecting elements, whose reflecting phases are dynamically adjusted to adapt to the instantaneous channel state information (CSI). We define the set for the IRSs’ indices.
Under a flat-fading channel model, the received signal of AP can be written as
(2) 
where is the signal transmitted by worker ; denotes the channel coefficient from worker to AP ; and represents the additive noise. The signal satisfies the transmit power constraint .
Due to the presence of IRSs, the channel coefficient is modelled as [8, 9, 10]
(3) 
where denotes the small-scale fading channel from worker to AP ; represents the small-scale fading channel vector from IRS to AP ; is the small-scale fading channel vector from worker to IRS ; denotes the pathloss of the direct link from worker to AP ; is the pathloss of the composite link from worker to AP through IRS ; and is a diagonal matrix that represents the reflecting operation of IRS , which is defined as
(4) 
where denotes the reflecting phase of the th element of IRS .
We model the pathloss between worker and AP as , where is the Euclidean distance in meters between the two input vectors, and denote the position vectors of worker and AP , respectively, is the pathloss exponent, and denotes the pathloss at the reference distance of m. For the pathloss of the composite channel from worker to AP through IRS , we adopt the sum-distance model [7], which models as
(5) 
where denotes the position vector of IRS .
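The channel model above can be illustrated with a short numerical sketch for one worker and one AP. Since the symbols of (3)-(5) are not reproduced here, every name below (`pathloss`, `effective_channel`, `rho0`, `eta`, `d0`) is hypothetical, and the specific forms are assumptions: a power-law pathloss, a sum-distance pathloss for the reflected link, and a reflected term of the form g^H diag(e^{jθ}) f. The numerical parameter values are illustrative only.

```python
import numpy as np

def pathloss(distance_m, rho0=1e-3, eta=3.0, d0=1.0):
    """Assumed power-law pathloss rho0 * (d / d0)^(-eta); values are illustrative."""
    return rho0 * (distance_m / d0) ** (-eta)

def effective_channel(h_direct, g_list, f_list, phase_list,
                      pos_worker, pos_ap, pos_irs_list):
    """Direct link plus one reflected link per IRS (assumed form of the model)."""
    rho_direct = pathloss(np.linalg.norm(pos_worker - pos_ap))
    h = np.sqrt(rho_direct) * h_direct
    for g, f, theta, pos_irs in zip(g_list, f_list, phase_list, pos_irs_list):
        # Sum-distance model: pathloss evaluated at the total reflected path length.
        d_sum = (np.linalg.norm(pos_worker - pos_irs)
                 + np.linalg.norm(pos_irs - pos_ap))
        rho_reflect = pathloss(d_sum)
        # g^H diag(exp(j*theta)) f, written element-wise.
        h = h + np.sqrt(rho_reflect) * np.sum(np.conj(g) * np.exp(1j * theta) * f)
    return h

rng = np.random.default_rng(0)

def crandn(n):
    """n i.i.d. CN(0, 1) samples (Rayleigh fading)."""
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

n_elem = 8
pos_worker, pos_ap = np.array([0.0, 0.0]), np.array([50.0, 0.0])
pos_irs = np.array([25.0, 10.0])
h_d, g, f = crandn(1)[0], crandn(n_elem), crandn(n_elem)

theta_random = rng.uniform(0.0, 2.0 * np.pi, n_elem)
# Phases that co-phase every reflected term with the direct link.
theta_aligned = np.angle(h_d) - np.angle(np.conj(g) * f)

h_random = effective_channel(h_d, [g], [f], [theta_random],
                             pos_worker, pos_ap, [pos_irs])
h_aligned = effective_channel(h_d, [g], [f], [theta_aligned],
                              pos_worker, pos_ap, [pos_irs])
```

With a single worker and a single AP, co-phasing is optimal by the triangle inequality; with several workers and APs the same phases affect all links at once, so the per-link alignment used in this sketch is no longer available.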
III Over-the-Air Computation in IRS-Aided C-RAN
In this section, we illustrate the operations at the worker devices, the APs, and the PS in the IRS-aided C-RAN system described in Sec. II.
III-A Transmission at Worker Devices
Without claim of optimality (see [15]), we assume that each worker uses the maximum transmit power , so that the transmit signal is given as
(6) 
with the coefficient . We note that this does not require CSI at worker devices.
III-B Quantization at APs
AP sends a quantized version of the received signal to the PS through a fronthaul link of capacity bit/sample. Under the assumption that the updated model vectors have a sufficiently large dimension, the quantized signal denoted by can be modelled as [14, 16]
(7) 
where models the quantization distortion as being independent of and distributed as . According to standard rate-distortion-theoretic results [17], the quantization noise power satisfies the condition
(8) 
where denotes the variance of the received signal given as
(9) 
The minimum distortion power that satisfies the condition (8) is given as
(10) 
Note that the optimal distortion level (10) is a function of the reflecting phases , since affects the channel coefficients as seen in (3).
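As a rough numerical illustration of the fronthaul constraint, the sketch below assumes the standard Gaussian test channel, for which the rate condition reads log2(1 + σ_y²/ω) ≤ C, so that the smallest admissible distortion power is ω = σ_y²/(2^C − 1). It also assumes independent zero-mean transmit signals, so that the received-signal variance is the sum of the per-worker received powers plus the noise power. The function and variable names are hypothetical and the forms are assumptions rather than a transcription of (8)-(10).

```python
import numpy as np

def received_signal_variance(h_row, tx_power, noise_var):
    """Assumed form: sum_k |h_k|^2 * P_k + noise variance (independent signals)."""
    return float(np.sum(np.abs(h_row) ** 2 * tx_power) + noise_var)

def min_quantization_noise_power(signal_var, capacity_bits):
    """Smallest omega with log2(1 + signal_var / omega) <= capacity_bits."""
    return signal_var / (2.0 ** capacity_bits - 1.0)

# Illustrative values: two workers, one AP, fronthaul capacity of 3 bit/sample.
h_row = np.array([0.3 + 0.4j, -0.1 + 0.2j])  # channels from the workers to this AP
signal_var = received_signal_variance(h_row, tx_power=2.0, noise_var=0.1)
omega = min_quantization_noise_power(signal_var, capacity_bits=3.0)
fronthaul_rate = np.log2(1.0 + signal_var / omega)  # rate used at this distortion
```

Because the received-signal variance depends on the channel coefficients, changing the IRS phases changes `signal_var` and therefore `omega`, which is the dependence noted in the text.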
III-C Estimation at PS
Based on the received quantized signals , the PS estimates the target parameter in (1). To elaborate, let us define a vector which stacks the quantized signals. Then, the vector can be expressed as
(11) 
where we have defined the vectors , and with .
The channel vector from worker to all the APs can be written as a function of the IRSs’ phases as
(12) 
where the matrices , , and the vectors , are defined as , , , and , respectively. Note that the optimization of the phases of IRS is equivalent to that of the vector as long as the conditions
(13) 
are satisfied for all , where denotes the th element of . From the vector , each phase can be obtained as .
We assume that the PS performs a linear estimation of the target parameter from . Accordingly, an estimate of is given as
(14) 
with a linear detection vector .
For given phases , i.e., , and linear detection vector , the MSE between the estimate and the target parameter is evaluated as
(15)  
IV Optimization
We tackle the problem of jointly optimizing the IRSs’ reflecting phases and the linear detection vector of the PS with the goal of minimizing the MSE in (15) while satisfying the unit modulus constraints (13). The problem can be stated as
(16a)  
s.t.  (16b) 
Since it is difficult to jointly optimize the variables and , we propose an iterative algorithm that alternately optimizes one variable while fixing the other.
If we fix the IRSs’ phases in problem (16), finding the optimal detector becomes an unconstrained quadratic optimization problem, whose closed-form solution is given as
(17) 
To tackle the problem of optimizing the IRSs’ phases for fixed , we remove the terms that are not dependent on the IRSs’ phases from the cost function. Stating the obtained problem with respect to a stacked vector with yields
(18a)  
s.t.  (18b) 
where we have defined the notations , , , , , and with and being the th element of and the th column of an identity matrix of size , respectively.
The problem (18) is nonconvex due to the unit modulus constraints (18b). To handle this issue, we adopt the matrix lifting approach proposed in [6]. Accordingly, we tackle the problem (18) with respect to a matrix defined as
(19) 
The matrix is subject to the constraints , , and for all . From , the IRSs’ phase vector can be recovered as the first elements of the last column of .
We tackle (18) with respect to by using the following equalities:
(20)  
(21) 
Specifically, by substituting (20) and (21) into problem (18), we obtain the problem
(22a)  
s.t.  (22b)  
(22c) 
with the matrix defined as
(23)  
To address the nonconvexity of constraint (22c), we note that (22c) is equivalent to the constraint [6]
(24) 
where denotes the largest singular value of the input matrix. The function is convex in [18]. Furthermore, for , the left-hand side (LHS) of (24) is 0 when and it becomes larger than 0 otherwise.
Based on this observation, as in [6], we tackle the problem
(25a)  
s.t.  (25b) 
with a fixed weight . In problem (25), we have removed the rank constraint (22c) and instead added a penalty term to the cost function that increases if (22c) is not satisfied.
The problem (25) is a difference-of-convex (DC) problem whose locally optimal solution can be efficiently found via the concave-convex procedure (CCP) approach [19]. CCP solves a sequence of convex problems obtained by linearizing the terms that induce nonconvexity. In the DC problem (25), the only term that induces nonconvexity is in the penalty term. Linearizing at a reference point yields the upper bound [6]
(26) 
where returns the eigenvector of the input matrix corresponding to the largest eigenvalue. The condition (26) is satisfied with equality when . The CCP-based algorithm for optimizing is summarized in Algorithm 1.
1. Initialize as (19) with arbitrary that satisfies (18b), and set
2. Update as a solution of the convex problem:
s.t. 
3. Stop if is satisfied. Otherwise, go back to Step 2 with .
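The rank-one penalty and its linearization can be checked numerically. The sketch below assumes the penalty has the form tr(V) − λ_max(V) for a positive semidefinite V (for such matrices the largest singular value equals the largest eigenvalue), and that the linearization at a reference point replaces λ_max(V) by u^H V u, where u is the unit-norm principal eigenvector of the reference matrix. The function names are hypothetical. The convex subproblem in Step 2 is not solved here, since it requires a semidefinite programming solver.

```python
import numpy as np

def rank_one_gap(V):
    """tr(V) - lambda_max(V): zero for rank-one PSD V, positive for higher rank."""
    eigvals = np.linalg.eigvalsh(V)  # eigenvalues in ascending order
    return float(np.real(np.trace(V)) - eigvals[-1])

def linearized_gap(V, V_ref):
    """Upper bound on rank_one_gap(V) from linearizing lambda_max at V_ref."""
    _, eigvecs = np.linalg.eigh(V_ref)
    u = eigvecs[:, -1]  # unit-norm principal eigenvector of V_ref
    return float(np.real(np.trace(V)) - np.real(u.conj() @ V @ u))

rng = np.random.default_rng(2)
n = 6

# Lifted matrix built from unit-modulus entries with a trailing 1: rank one.
v = np.append(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n)), 1.0)
V_rank_one = np.outer(v, v.conj())

# Generic PSD matrix of full rank, and a second PSD matrix as reference point.
A = rng.standard_normal((n + 1, n + 1)) + 1j * rng.standard_normal((n + 1, n + 1))
B = rng.standard_normal((n + 1, n + 1)) + 1j * rng.standard_normal((n + 1, n + 1))
V_full = A @ A.conj().T
V_ref = B @ B.conj().T

gap_rank_one = rank_one_gap(V_rank_one)
gap_full = rank_one_gap(V_full)
bound_full = linearized_gap(V_full, V_ref)
bound_at_ref = linearized_gap(V_full, V_full)
```

The bound holds because u^H V u never exceeds λ_max(V) for a unit-norm u, and it is tight when the reference point equals V, which is what makes each CCP iteration a convex surrogate that touches the original cost at the current iterate.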
Overall, the proposed algorithm that alternately optimizes the IRSs’ phases and the linear detector is detailed in Algorithm 2. In the algorithm, we initialize and in Steps 1-2, and update for fixed in Steps 3-4. In Step 4, is modified only when it does not satisfy the modulus constraints (18b). In Step 5, is updated for fixed , and we check the convergence in Step 6.
V Numerical Results
In simulation, we assume that the positions of workers, APs and IRSs are uniformly distributed in a circular area of radius 100 m. We set the variance of local parameters to for and assume dB, in the pathloss models and for the penalty coefficient in (25a). For all links, we consider independent Rayleigh fading channels that are distributed as , and . We compare the performance of the proposed optimized scheme with two baseline schemes, one without IRSs and one with IRSs whose reflecting phases are randomly chosen. In all figures, we plot the normalized MSE, which is defined as the MSE normalized by so that it lies in the range .
In Fig. 2, we plot the average normalized MSE versus the fronthaul capacity for an IRS-aided C-RAN system with , , , and dB. The figure shows that the proposed optimized scheme outperforms both baseline schemes without IRS and with random phases, and that the gain increases with the fronthaul capacity . This is because, when is small, the impact of carefully designing the IRSs’ phases becomes minor due to the impact of the quantization noise signals . Also, the gain increases with the signal-to-noise ratio (SNR) of the uplink channel, and this trend coincides with the observation reported in [10, Sec. IV].
Fig. 3 plots the average normalized MSE versus the number of APs for an IRS-aided C-RAN system with , , , and dB. When there are only a few APs, deploying IRSs provides relevant gains only when the reflecting phases are optimized according to Algorithm 2. However, the impact of optimizing the reflecting phases becomes minor for sufficiently large .
VI Concluding Remarks
We have studied the impact of deploying IRSs on AirComp in a C-RAN system. To this end, we have tackled the joint optimization of the IRSs’ reflecting phases and the linear detector at the PS with the goal of minimizing the MSE of the parameter estimated at the PS. Numerical results were provided that investigate the effects of various parameters on the performance gain of the proposed optimization scheme compared to baseline schemes. Among open problems, we mention the design of the channel estimation process, the investigation of the effect of imperfect CSI, and the design of AirComp jointly with information transfer.
References
[1] K. Bonawitz et al., “Towards federated learning at scale: System design,” arXiv:1902.01046, Feb. 2019.
[2] B. Nazer and M. Gastpar, “Computation over multiple-access channels,” IEEE Trans. Inf. Theory, vol. 53, no. 10, pp. 3498–3516, Oct. 2007.
[3] M. M. Amiri and D. Gunduz, “Machine learning at the wireless edge: Distributed stochastic gradient descent over-the-air,” in Proc. IEEE ISIT 2019, Paris, France, Jul. 2019, pp. 1–5.
[4] K. Yang, T. Jiang, Y. Shi, and Z. Ding, “Federated learning via over-the-air computation,” IEEE Trans. Wireless Commun., vol. 19, no. 3, pp. 2022–2035, Mar. 2020.
[5] T. Sery and K. Cohen, “On analog gradient descent learning over multiple access fading channels,” arXiv:1908.07463, Aug. 2019.
[6] T. Jiang and Y. Shi, “Over-the-air computation via intelligent reflecting surfaces,” in Proc. IEEE Globecom 2019, Waikoloa, USA, Dec. 2019.
[7] E. Basar, M. D. Renzo, J. D. Rosny, M. Debbah, M. Alouini, and R. Zhang, “Wireless communications through reconfigurable intelligent surfaces,” IEEE Access, vol. 7, pp. 116753–116773, 2019.
[8] E. Bjornson, O. Ozdogan, and E. G. Larsson, “Intelligent reflecting surface vs. decode-and-forward: How large surfaces are needed to beat relaying?,” arXiv:1906.03949, Jun. 2019.
[9] Q. Wu and R. Zhang, “Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming,” IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5394–5409, Nov. 2019.
[10] C. Pan, H. Ren, K. Wang, W. Xu, A. Nallanathan, M. Elkashlan, and L. Hanzo, “Multicell MIMO communications relying on intelligent reflecting surface,” arXiv:1907.10864, Jul. 2019.
[11] M. Di Renzo et al., “Smart radio environments empowered by reconfigurable AI meta-surfaces: An idea whose time has come,” EURASIP J. Wireless Commun. Netw., pp. 1–20, May 2019.
[12] K. Ntontin et al., “Reconfigurable intelligent surfaces vs. relaying: Differences, similarities, and performance comparison,” arXiv:1908.08747, Aug. 2019.
[13] R. Karasik, O. Simeone, M. Di Renzo, and S. Shamai, “Beyond max-SNR: Joint encoding for reconfigurable intelligent surfaces,” arXiv:1911.09443, Nov. 2019.
[14] S.-H. Park, O. Simeone, O. Sahin, and S. Shamai, “Fronthaul compression for cloud radio access networks: Signal processing advances inspired by network information theory,” IEEE Signal Process. Mag., vol. 31, no. 6, pp. 69–79, 2014.
[15] X. Cao, G. Zhu, J. Xu, and K. Huang, “Optimal power control for over-the-air computation in fading channels,” arXiv:1906.06858, Jun. 2019.
[16] R. Zamir and M. Feder, “On lattice quantization noise,” IEEE Trans. Inf. Theory, vol. 42, no. 4, pp. 1152–1159, Jul. 1996.
[17] A. E. Gamal and Y.-H. Kim, Network Information Theory, Cambridge Univ. Press, 2011.
[18] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge Univ. Press, 2004.
[19] M. Tao, E. Chen, H. Zhou, and W. Yu, “Content-centric sparse multicast beamforming for cache-enabled cloud RAN,” IEEE Trans. Wireless Commun., vol. 15, no. 9, pp. 6118–6131, 2016.