To improve the spectrum and energy efficiency of the forthcoming and future wireless networks (5G and beyond), network service providers (SPs) have investigated and deployed several innovative technologies such as massive MIMO and mmWave. However, the required high complexity and high hardware cost remain the main hindrance to their implementation in practice. Recently, intelligent reflecting surface (IRS) has emerged as a new and cost-effective solution for the SPs. An IRS is generally composed of a large number of passive elements, each of which is able to reflect the incident signal with an adjustable phase-shift. By intelligently tuning the phase-shifts of all elements in response to dynamic wireless channels, the signals reflected by the IRS can add constructively with non-reflected signals at the user receiver, boosting the received signal power and enhancing the data throughput. As such, IRS allows the SPs to improve the spectrum and energy efficiency, extend the network coverage, and enhance the Quality of Service (QoS) of the users at low cost.
Apart from the above benefits, IRS enables the SPs to provide a new network resource and a new service to the mobile users. In particular, the SPs can provide IRS resources to the mobile users in addition to traditional network resources such as antenna, spectrum, and power. Indeed, some works have investigated IRS resource allocation issues. In particular, the authors in  consider the allocation of transmit power and IRS resources to mobile users. The network includes one base station (BS) and one IRS. The IRS is divided into modules of reflection elements, i.e., reflection modules. Then, the problem is to determine the number of reflection modules, the corresponding passive beamforming, and transmit power for the users to maximize the signal-to-interference-plus-noise ratio (SINR). Note that the authors allocate reflection modules, rather than all the reflection elements, to the users, since frequently triggering all the reflection elements can increase the latency of adjusting the phase-shifts. To solve this problem, the parallel alternating direction method of multipliers (PADMM) algorithm is used. Different from , the authors in  assume that the BS and the IRS belong to different SPs. To maximize the individual utility of the BS and IRS, the Stackelberg game is proposed to jointly optimize the IRS resource price, the transmit power, and the passive beamforming of the triggered reflection modules. However, both the works in  and  consider scenarios with a single IRS.
In this letter, we consider an IRS-assisted wireless network with multiple SPs and multiple mobile users. The SPs deploy BSs along with IRSs and provide network services to the users. In particular, the SPs are responsible for allocating the transmit power and IRS resources to the users for their data transmissions. To satisfy different QoS requirements of the mobile users, similar to other works, e.g., , , we assume that each SP divides the transmit power and the IRS into different power levels and reflection modules, respectively. The SPs may set different prices for their resources, and the users that select different SPs and services may achieve different utilities. Due to their rationality, the users with low utilities have an incentive to adapt their SP and service selections. In other words, the users can dynamically change their SP and service selection strategies over time. To model the dynamic SP and service selection strategies of the users, we propose to use the evolutionary game . The reason for the use of the evolutionary game is that this game is able to deal with the problem of dynamic selection strategies of players, i.e., the users. In particular, the players in the evolutionary game are bounded rational, and they can adapt their strategies gradually to reach the evolutionary equilibrium. Furthermore, the algorithm to implement the strategy adaptation based on the evolutionary game has a low complexity that is suitable for the dynamic strategy selections of the users .
Our main contribution is as follows: we first formulate the SP and service selection in the network as an evolutionary game. In the game, the users form populations, and they adjust their SP and service selections based on their utilities. We then model the SP and service adaptation of the users as replicator dynamics in the evolutionary game and analyze the equilibrium of the evolutionary game. Finally, we provide performance evaluation to demonstrate the consistency with the analytical results and to validate the proposed game model.
II System Model
We consider a system model as shown in Fig. 1 that consists of a set of BSs and a set of single-antenna users. The BSs are assumed to belong to service providers (SPs), and each BS is equipped with antennas. Note that our model can be extended to a general case in which one SP deploys multiple BSs and IRSs. To avoid the intra-cell interference, each SP uses time-division multiple access (TDMA) to deploy the IRS-enhanced communication service. Also, frequency-division multiple access (FDMA) is used to avoid the inter-cell interference among the users belonging to different BSs. Let denote the bandwidth assigned to BS . To provide flexible services to the users, BS has a set of power levels, denoted by , that the users can select for their transmissions. We assume that , where is the maximum power of BS . SP deploys an IRS, denoted by IRS , to improve the QoS for the users that BS serves. IRS has reflection elements, and the IRS is divided into modules controlled by parallel switches. Each module in IRS consists of elements, and thus . One BS-IRS pair can serve multiple users, but each user is associated with only one BS-IRS pair. Moreover, the user is allowed to select one or multiple modules, i.e., a subset of modules, of the selected IRS. In general, given a selected power level and bandwidth , the data throughput achieved by the user depends on the number of modules in the selected subset rather than on the order/indexes of the modules. Thus, SP has potential subsets of modules that the user can select, and subset , has modules. Denote as the phase-shift matrix corresponding to the subset that the user selects, i.e., subset of IRS . Then, is a diagonal matrix whose main diagonal consists of the phase-shifts of reflection elements of IRS . In particular, we have , where is the phase-shift of reflection element of subset of IRS , .
With the assistance of subset of IRS , the signal received at each user is the sum of 1) the received signal via the direct link and 2) the received signal via the IRS-assisted link. Thus, the received signal at user when selecting subset of IRS and power level of BS is determined as follows:
where is the data symbol intended to user , is the beamforming vector associated with containing power level that the user selects, is the vector of channels of the direct link from BS to user , is the vector of channels of the link from subset of IRS to user , is the vector of channels from BS to subset of IRS , and is the complex additive white Gaussian noise at user , , where is the variance. We assume that the channel state information (CSI) of all channels involved is perfectly known at BS , i.e., based on the pilot signals. In addition, since IRSs are typically deployed in static environments due to the challenging task of CSI estimation, we can also assume that the quasi-static flat-fading model or even the static flat-fading model applies to all channels . The signal-to-noise ratio (SNR) of user is defined as the ratio of the power of the signal targeted to user and the noise over bandwidth as follows:
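As a minimal numerical sketch of the received-signal and SNR model described above, the following Python code combines the direct link and the IRS-assisted link and divides the resulting signal power by the noise power. The composite-channel form (direct channel plus IRS-to-user channel, diagonal phase-shift matrix, and BS-to-IRS channel) is the commonly used IRS model and is an assumption here, as are all names (`irs_snr`, `align_phases`, `h_d`, `h_r`, `G`, `theta`, `noise_power`) and the randomly drawn channels.

```python
import numpy as np

def irs_snr(w, h_d, h_r, G, theta, noise_power):
    """SNR of one user served over a direct link plus an IRS-assisted link.

    Assumed model: y = (h_d^H + h_r^H diag(exp(j*theta)) G) w s + noise.
    w: (M,) beamforming vector, h_d: (M,) direct channel,
    h_r: (E,) IRS-to-user channel, G: (E, M) BS-to-IRS channel,
    theta: (E,) phase-shifts in radians of the selected elements.
    """
    Theta = np.diag(np.exp(1j * theta))               # diagonal phase-shift matrix
    effective = h_d.conj() + h_r.conj() @ Theta @ G   # composite channel, shape (M,)
    return np.abs(effective @ w) ** 2 / noise_power

def align_phases(w, h_d, h_r, G):
    """Phase-shifts that make every reflected path add in phase with the direct path."""
    direct = h_d.conj() @ w
    cascaded = h_r.conj() * (G @ w)                   # per-element reflected contribution
    return np.angle(direct) - np.angle(cascaded)

def crandn(rng, *shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Hypothetical setup: 4 BS antennas, a selected subset of 16 reflection elements.
rng = np.random.default_rng(0)
M, E, power, noise_power = 4, 16, 1.0, 1e-2
h_d, h_r, G = crandn(rng, M), crandn(rng, E), crandn(rng, E, M)
w = np.sqrt(power) * h_d / np.linalg.norm(h_d)        # beamformer matched to the direct link

snr_zero = irs_snr(w, h_d, h_r, G, np.zeros(E), noise_power)
snr_aligned = irs_snr(w, h_d, h_r, G, align_phases(w, h_d, h_r, G), noise_power)
print(f"SNR without phase tuning: {snr_zero:.2f}, with aligned phases: {snr_aligned:.2f}")
```

For a fixed beamforming vector, the aligned phase-shifts cannot yield a lower SNR than any other phase choice, since all reflected terms then add in phase with the direct term. The joint optimization of the beamformer and the phase-shifts is a separate problem that this sketch does not address.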
III Game Formulation and Equilibrium Analysis
In total, there are users, BSs, and IRSs in the network. SP offers a set of power levels and a set of subsets of IRS modules that the user can select. In particular, one subset consists of one or multiple modules of the IRS. Again, one module of IRS consists of elements. Without loss of generality, we assume that users are divided into groups. Each group, say group , consists of users that select IRS , subset , i.e., the corresponding phase-shift matrix , and power level . We can say that users in group select network service . Note that when the user selects IRS , it is only allowed to select subsets of modules of IRS and power levels of BS . We have , and each user in group selects IRS , , and with a probability of . The phase-shift matrix of the IRS is typically optimized for a group of nearby users, and we assume that the channel statistics are identical for the users in the same group . Therefore, the users in the group can be represented by an index set , and the expected downlink transmission rate of the user in the group is
where , where refers to the beamforming vector associated with the users in the group that selects power level . Let denote the value of unit data to the user in group when selecting IRS , subset , and power level . Denote as the price per element in IRS and as the price per unit power. Prices and are set by SP and are constant. Since users in each group share the same resources, they should share the resource cost. Then, the utility of the user is given by
where the -norm is used to count the number of non-zero elements of a diagonal matrix that here refers to the number of active reflection elements of IRS that the user selects. The average utility of the user is
. We leverage the replicator dynamics to model the SP and service adaptation of the users. The replicator dynamic process of the users is expressed as a series of ordinary differential equations as follows:
where represents the first derivative of with respect to , and is the initial strategy of the users in group at . The factor is the learning rate of the users that evaluates the strategy adaptation frequency.
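To illustrate how the replicator dynamics drive the strategy adaptation, the following Python sketch integrates the standard single-population replicator equation with a forward-Euler scheme. It is a simplified illustration under stated assumptions, not the exact multi-group formulation: the utility function (a rate that is time-shared among the users of a service, minus a fixed cost), the numbers in `RATE`, `COST`, and `N_USERS`, and the names `replicator_step`, `simulate`, and `delta` (the learning rate) are all hypothetical.

```python
import numpy as np

def replicator_step(x, utilities, delta, dt):
    """One forward-Euler step of dx_i/dt = delta * x_i * (u_i - u_avg)."""
    u_avg = float(x @ utilities)                      # average utility of the population
    return x + dt * delta * x * (utilities - u_avg)

def simulate(utility_fn, x0, delta=1.0, dt=0.01, steps=5000):
    """Iterate the replicator dynamics from the initial proportions x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = replicator_step(x, utility_fn(x), delta, dt)
    return x

# Hypothetical services: each offers a rate that is shared among its users
# and charges a fixed cost. These numbers are placeholders for illustration.
RATE = np.array([10.0, 8.0, 6.0])
COST = np.array([1.0, 0.5, 0.2])
N_USERS = 20

def example_utility(x):
    """Utility of each service given the proportions x of users selecting it."""
    return RATE / (1.0 + N_USERS * x) - COST

x_eq = simulate(example_utility, [0.6, 0.3, 0.1])
u_eq = example_utility(x_eq)
print("proportions at the end of the run:", np.round(x_eq, 4))
print("utilities at the end of the run:  ", np.round(u_eq, 4))
```

In this toy setting the proportions of users selecting services with above-average utility grow, those with below-average utility shrink, and the adaptation stops once all selected services yield the same utility. A larger `delta` scales the right-hand side of the equation and therefore speeds up the adaptation.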
Please refer to  for the detailed proof. ∎
Here, we prove that and are continuous functions in the rectangle . Indeed, it is clear that function is continuous at every . Moreover, due to the static flat-fading channel model, the channel vectors , , and are constant and thereby continuous at every . Therefore, is continuous at every , and functions , and are also continuous at every if . Since and , then and are continuous functions in the open rectangle . According to Theorem (7), problems in (6) and (III) converge to a unique solution. This is verified by simulations in the next section.
Note that to make the decision on SP and service selections, the users need information about the average utility, i.e., , and proportion of users choosing different strategies, i.e., , from the BSs. However, the up-to-date information may not be available at the users due to the communication latency. Therefore, at time , the users rely on the information at time , i.e., delayed by units of time, to make the SP and service selections. Thus, the delayed replicator dynamic process is
Note that when the delay is large, the decisions of the users based on the outdated information tend to be inaccurate. As a result, the evolutionary game with the delayed replicator dynamics may not converge. Determining such that the evolutionary game converges is challenging. As an example, consider a simple scenario with , and each SP offers one service including subset and power level :
The stability of the evolutionary equilibrium with the delayed replicator dynamics can be guaranteed if and only if
The detailed proof can be derived by following  and is omitted here. ∎
Theorem (2) shows that the stability of the evolutionary equilibrium is guaranteed as the users use the information at for their decisions.
IV Performance Evaluation
In this section, we present the numerical results to demonstrate the effectiveness of the proposed dynamic SP and service selection in the IRS-assisted wireless network. We consider a network with SPs, BSs, IRSs and users. The sizes of IRS 1 and IRS 2 are elements. SP 1 divides IRS 1 into modules, and SP 2 does not divide IRS 2. Each BS is equipped with antennas and offers power levels that the users can select: dBm, dBm, dBm, and dBm. As such, SP 1 offers 4 services, namely Services 1, 2, 3, and 4, and SP 2 offers 2 services, namely Services 1 and 2. All channels experience Rayleigh fading, and the path loss at the reference distance m is dB. Due to the obstacles, the path loss of the direct links from the BS to the users is much higher than those of the links between the BS and the IRS as well as the links between the IRS and the users. We can thus set , and . When the user selects a service of the SP, the BS uses the fixed point iteration algorithm  to optimize the beamforming vector and the phase-shift matrix of the IRS subset for the user.
First, it is important to verify the equilibrium convergence of the game scheme. Figure 2(a) shows the utilities of the users selecting different SPs and services versus time. As seen, the utilities of the users selecting different SPs and services vary until the evolutionary equilibrium is reached. Also, at the evolutionary equilibrium, the users achieve the same utility although they select different SPs and services. The reason is that the evolutionary equilibrium is reached only when the utilities of the users selecting any service provided by any SP are equal to the expected utilities of the users.
Note that the time to reach the evolutionary equilibrium can be different depending on the learning rate and the number of users as shown in Fig. 2(b). As seen, the evolutionary equilibrium is reached faster as the value of is higher. The reason is that the frequency of the strategy adaptation of the users is higher with the high value of . Moreover, for a given value of , more time is required to reach the evolutionary equilibrium as the number of users is higher.
Next, we discuss the impact of information delay on the proportions of users selecting different SPs and services. For evaluation purposes, we consider the proportion of the users selecting Service 1 provided by SP 1 as shown in Fig. 3(a). As seen, when , the strategy adaptation dynamics fluctuate. The fluctuation becomes larger as increases, and the service selection cannot reach the evolutionary equilibrium as . This is because the outdated information makes the users' decisions inaccurate.
Now, we investigate the impact of the sizes of IRSs on the proportions of users selecting different SPs and services. In particular, we vary the size of IRS 2 provided by SP 2. Note that is also the size of one module for trading since SP 2 does not divide IRS 2 into different modules. As shown in Fig. 3(b), as the size of IRS 2 increases, the proportions of users selecting services provided by SP 2 increase. The reason is that as increases, the throughput and utility obtained by the users selecting services provided by SP 2 increase. However, when the size of IRS 2 is large, the rate of increase tends to slow down. This is because the users pay a very high resource cost when they select the services provided by SP 2. As a result, their utilities decrease, and they tend to select the services provided by SP 1.
We finally discuss how the mobility of the users affects their SP and service selection. Figure 4 shows the proportions of users selecting different SPs as the distance between the users and IRS 1 varies for different IRS prices set by SP 1. As observed, for a given price, the proportion of users selecting SP 2 increases as the distance between the users and IRS 1 increases. The reason is that as the distance between the users and IRS 1 increases, the throughput obtained by the users when selecting services of SP 1 decreases. Thus, the users are willing to select services of SP 2. Note that as SP 1 increases its service price, the utilities of the users currently selecting services of SP 1 decrease, and thus the proportion of users selecting SP 1 decreases as shown in the figure.
In this letter, we have investigated the dynamic SP and service selection in an IRS-assisted wireless network. Specifically, we have formulated a joint SP and service selection problem as an evolutionary game. We have modeled the SP and service adaptation of the users as replicator dynamics and analyzed the equilibrium of the evolutionary game. We have provided performance evaluation to demonstrate the consistency with the analytical results and to validate the proposed game model.
- (2019) Dynamic access point and service selection in backscatter-assisted RF-powered cognitive networks. IEEE Internet of Things Journal 6 (5), pp. 8270–8283.
- (2020) A Stackelberg game approach to resource allocation for intelligent reflecting surface aided communications. arXiv preprint arXiv:2003.06640.
- (2020) Reflection resource management for intelligent reflecting surface aided wireless networks. arXiv preprint arXiv:2002.00331.
- (Website)
- (2003) Evolutionary game dynamics. Bulletin of the American Mathematical Society 40 (4), pp. 479–519.
- (2019) Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming. IEEE Transactions on Wireless Communications 18 (11), pp. 5394–5409.
- (2019) MISO wireless communication systems via intelligent reflecting surfaces. In IEEE International Conference on Communications in China, pp. 735–740.