Massive multiple-input multiple-output (MIMO) is slated to be a key enabling technology for 5G [1, 2], in particular in order to meet the ambitious goal of a thousandfold increase in data traffic . For the enhanced mobile broadband use case, where the required increase in capacity is induced by relatively few users executing demanding applications, the ability of massive MIMO systems to serve many users simultaneously with the same frequency resources is a clear advantage. However, in the massive machine-type communication case, it becomes impossible even for a massive MIMO base station to serve all devices at once, leading to new resource allocation problems in how to assign the limited number of available pilot signals to devices . This is particularly challenging in cases where the devices are heterogeneous, with differing traffic volumes and quality of service requirements.
For massive machine-type communications, arising for example in Internet of Things (IoT) use cases, where the number of users per cell is larger than the number of pilot signals, it is not possible for each device to be assigned a pilot. If traffic demands are sporadic, then random access protocols may be the most appropriate access method, and some work has been conducted on designing such protocols for massive MIMO capable of resolving pilot collisions [5, 6, 7]. However, these protocols suffer from the usual inefficiencies present in any random access protocol; in particular, the above protocols require that devices transmit multiple copies of data in order to resolve collisions through successive interference cancellation, resulting in less efficient use of resources. For high offered load scenarios and/or those with less variable traffic demands (especially in the most extreme case of periodic traffic), scheduled access may be more appropriate.
Regardless of the access protocol used, there is a need for new models based on the physical layer technology of massive MIMO, but appropriate for investigating higher-layer performance. In this paper we develop such a model, based on mixed-integer programming, and apply it to the case of end device scheduling for maximal throughput, with heterogeneous traffic demands (different data volumes to transmit and different data rates for different users). Here, we optimize not only the allocation of pilot resources to devices, but also transmission power control on both the uplink and downlink, and investigate whether it is possible to avoid the need for complex power control schemes through efficient scheduling of end device transmissions. We consider two widely used precoding schemes: maximum ratio combining (MRC) and zero forcing (ZF). These can be seen as the two corner cases for precoding: one where no regard is paid to minimizing mutual interference but rather only to maximizing receiver power levels for all users, and one where mutual interference is completely nulled out (in the ideal case), respectively. In practice, a compromise such as minimum mean-square error or regularized zero-forcing is used, but it is still of interest to investigate the two extreme cases. We also consider three different methods for power control. We have performed systematic numerical studies to investigate the performance of our models, in terms of solution time, achievable throughput, and energy usage.
The contributions of this paper are the following:
We present new optimization models suitable for massive MIMO systems that allow for heterogeneous users and traffic, as well as a greater number of users than available pilots, as is typical of IoT scenarios. In particular, we provide formulations based on compatible sets, previously used for optimization of wireless mesh networks, adapted here to the massive MIMO physical layer.
We develop an efficient solution approach based on column generation, capable of handling complex problems with non-compact formulations and integer decision variables. Both our models and solution approach are general and can be applied to a wide range of different objectives.
We provide formulations for two widely used precoding schemes (maximum ratio combining and zero forcing), as well as three power control schemes. These are the existing max-min fair SINR power control scheme, a new version of this scheme that takes scheduling into account, and fully optimized power control. Further, we investigate a variant of fair power control performed only on the downlink.
We apply our models and solution approach to the problem of joint scheduling and power control for throughput maximization for a large number of heterogeneous devices (more than the available pilots). We conduct an extensive numerical study to investigate and compare the performance of the different precoding and power control schemes, as well as of the optimization itself. In doing so, we identify which cases are most suitable for different scheduling and power control methods.
We create an efficient heuristic that provides similar performance to full optimization, while substantially reducing the solution time of the main integer programming problem.
We show that for IoT scenarios, where each device does not aim for maximum throughput but rather has its own traffic and QoS parameters, the existing power control methods are much worse for energy efficiency than when the transmission power is optimized taking into account each device’s traffic demand and desired rate. Further, we show that joint optimization of power control and scheduling greatly improves throughput in cases where the channel quality of devices is unbalanced, with a small group of devices with good channels, and a larger group with poor channels.
We show that it is possible to achieve the same performance in terms of throughput without power control on the uplink by scheduling devices efficiently. This saves significant signalling overhead and complexity on end devices, which is important for resource-constrained IoT devices.
The rest of this paper is organized as follows. Section II describes related work in this area. Section III elaborates our targeted scenario and system models. Section IV then details our optimization problems and solution approach. A discussion of the complexity and efficiency of our approach follows in Section V. In Section VI we present our numerical study, along with results and discussion. In Section VII we discuss future work, and, finally, in Section VIII we conclude this paper.
II Related Work
Massive MIMO, first proposed in , refers to multiple-antenna deployments in which the number of antennas at the base station is significantly higher than the number of user antennas. This allows us to exploit favorable propagation, that is, that the channel responses from each antenna at the base station to the different user terminals are sufficiently different to allow separation of the users’ data streams by digital pre- and post-processing. In the presence of favorable propagation and a large number of antennas, an effect known as channel hardening  arises, and a radical increase in spectral efficiency is possible . Channel hardening means that each channel will be close to its expected value and channel variation is negligible in both the time and frequency domains. The increase in capacity gained from using massive MIMO thus comes as a direct result of considerably increasing the number of antennas and benefiting from the statistical advantages derived from the resulting large number of different channels to each user.
Channel models for massive MIMO can be divided into two categories: correlation-based models and geometry-based models [11, 12]. Geometry-based models [13, 14] can be used for the performance evaluation of practical systems, while theoretical performance analysis of these systems often relies on correlation-based models. The latter type includes correlation channel models [15, 16], where the correlation between antennas is considered, and mutual coupling channel models , where the coupling between each pair of antennas is also taken into account. Perhaps the most widely used models for performance analysis, however, are independent and identically distributed (i.i.d.) Rayleigh fading models [8, 18, 19]. Here, small-scale fading of the channel is modeled by i.i.d. Gaussian variables. This makes the analysis more tractable but nonetheless yields models that are sufficiently powerful, despite their lack of realism, to demonstrate important results for massive MIMO systems, including the aforementioned channel hardening property , as well as the effective SINR expressions we will use in this paper.
While the above models include small-scale fading of the channel in individual coherence blocks, the effective SINR characterizes the expected performance over a larger number of blocks, and has been the subject of research since the early days of massive MIMO [20, 15, 21, 22, 23]. In our work, we are primarily interested in performance of massive MIMO systems over a larger time scale, and as such we take the effective SINR as the basis for our models. In particular, the effective SINR for the uplink in single cell systems was derived in , while for the downlink it was analyzed in . Although effective SINR results are also available for multi-cell massive MIMO systems, we do not consider these in this paper.
In our models, we also take into account transmission power control for both the uplink and downlink. Most existing work on power control for massive MIMO systems considers homogeneous users. Max-min fair power control , in which the minimum SINR amongst the users is maximized, is a commonly adopted scheme. Although not true of max-min optimization in general, in the case of massive MIMO power control, this results in a common SINR value for all users. Maximum (fair) SINR is however not the only objective considered in previous work. Power control for mitigation of pilot contamination, in which reuse of pilot signals (usually in different cells) results in impaired channel estimation, has been studied in, and energy efficiency has been considered in . In , joint power control is performed between cells, and here users are not homogeneous, but rather each user has an individual SINR target. However, different traffic demands for users are not considered. In our work here, we combine power control with user scheduling, in which each user may have a defined individual SINR target and demand, that is, the amount of data the user wishes to send.
We achieve the above by applying optimization, specifically mixed-integer programming. Optimization methods have already seen use for various purposes within the area of massive MIMO systems. In , weighted-sum mean-square error minimization and Rayleigh quotient methods were used to determine pilot signals that reduce pilot contamination and improve channel estimation. For the case where the number of pilots is equal to the number of users, semi-definite programming and convex optimization were used to find optimal solutions. User scheduling in the case where the number of users exceeds the number of pilots was not considered, however. In , the optimal number of antennas, number of active users, and transmission power were optimized for maximal energy efficiency. The focus was on studying the performance of massive MIMO systems, and closed-form expressions were obtained for the above. However, in order to do so, each parameter was considered one at a time, while the other two were held constant. We instead aim to develop methods for user scheduling and power control for specific scenarios, and consider joint optimization of the two, since the transmission power affects which users are able to transmit or receive simultaneously.
Joint power optimization and user association in multi-cell systems was studied in , where each user was associated to a set of base stations that would then serve that user. The models provided consist of efficient linear programming formulations; however, all users are always able to associate to at least one base station, that is, there is no limit placed on the number of available pilot signals. This means that user groups are static (only a single association was performed, rather than dynamic scheduling), resulting in a smaller solution space, and since user selection is not required, no integer variables are needed. For the scenario we consider here, with many more users than pilot signals, mixed-integer programming formulations are needed, and we use column generation to deal with the large number of possible combinations of simultaneously transmitting users.
Fair scheduling in multi-cell systems was investigated in  with asymptotic analysis using large random matrix theory and convex optimization. However, no solution was provided for actually scheduling the users; rather, the achievable fair rate was analyzed. Further, the analysis relies on assumptions that we do not require in our work, namely that the ratio of antennas to users is kept constant, and that users are scheduled in co-located groups. Joint antenna and user selection has also been studied for massive MIMO, but only using brute-force search, which has very high computational complexity and is thus impractical for all but very small problem instances, and a greedy algorithm , which is more efficient but provides only suboptimal solutions in general. Finally, user grouping and scheduling has also been studied in frequency-division duplex (FDD) massive MIMO systems , but users were placed in pre-beamforming groups, and only users in the same group could be scheduled together. These groups were formed using clustering algorithms that do not provide optimal solutions, and user scheduling was performed based on an SINR approximation that considers only a single user at a time. Moreover, time-division duplex (TDD) provides better performance for massive MIMO than FDD , and as such is a more suitable candidate for real implementations. In our work we consider a TDD-based system.
Our optimization models presented in this paper are based on the notion of compatible sets (c-sets), which were first introduced in  for transmission scheduling in wireless mesh networks. Wireless mesh networks, like massive MIMO systems but unlike previous generations of cellular systems, allow multiple, simultaneous, possibly interfering transmissions. A compatible set is then a set of simultaneous transmissions that are able to be successfully decoded at the receivers despite this possible interference, that is, where the SINR is sufficiently high at all receivers.
Since their introduction, compatible sets have seen many applications in optimization of wireless mesh networks. Extensions for multiple modulation and coding schemes (MCSs) and power control were given in  and . C-sets have been applied to joint link rate assignment and transmission scheduling , link scheduling , routing and scheduling for throughput optimization , and multicast routing and scheduling . While the above were primarily focused on throughput maximization, work using c-sets has also considered fairness [37, 42], delay minimization [43, 44] and energy efficiency .
The notion of compatible sets thus provides a general and flexible method for modeling and solving optimization problems for wireless systems. However, existing models for c-sets are based on nodes equipped with omnidirectional antennas, where interference depends primarily on the distance between receivers and transmitters. In massive MIMO systems, interference between simultaneous transmissions has different causes, for example imperfect channel estimation, and so these models cannot be applied to massive MIMO in their current form. In this paper, we develop a new type of compatible set model suitable for massive MIMO systems. We apply it to the specific case of user scheduling and power control; however, the model we present here is general and can be readily used for other types of objective functions.
III System Models
In the following, we use the channel models and effective SINR expressions derived in , Chapter 3, based on the work in  and . The notation used is summarized in Table I. We consider a scenario with a single-cell massive MIMO system. There is one base station with an antenna array with antennas, and there is a set of single-antenna end devices in the cell, with the number of devices . We specifically consider the case where is larger than the number of devices that can be spatially multiplexed by the base station. This will typically be the case in IoT scenarios, especially for machine-type communication. The number of devices that can be served simultaneously is limited by the number of available pilot signals, as well as the required SINR for the devices being served. Here, we will not address mobility of the end devices, so each end device has a fixed location.
|number of antennas in the base station’s antenna array|
|set of end devices|
|number of end devices,|
|distance of device from the base station|
|duration of coherence block in seconds|
|bandwidth of coherence block in Hertz|
|number of samples in each coherence block|
|number of pilot samples in each coherence block|
|number of available pilots in each coherence block|
|length of each pilot, in samples|
|number of uplink data samples in each coherence block|
|number of downlink data samples in each coherence block|
|large-scale effects coefficient of end device|
|reference distance in meters|
|path loss exponent|
|mean-square channel estimate for end device|
|device ’s uplink demand in coherence blocks,|
|device ’s downlink demand in coherence blocks,|
|SINR threshold for successful reception for device|
|family of all compatible sets|
|family of compatible sets in which end device is a transmitter|
|family of compatible sets in which end device is a receiver|
|set of devices that transmit in compatible set|
|set of devices that receive in compatible set|
|number of coherence blocks assigned to compatible set .|
|dual decision variable associated with end device|
|whether or not end device is active in a generated compatible set.|
|uplink power control coefficient for end device|
|downlink power control coefficient for end device|
|the set of all, and all non-negative, integers, respectively|
|the set of all, and all non-negative, real numbers, respectively|
III-A Channel Model
The channel for the massive MIMO cell can be divided into coherence blocks, where each coherence block is of duration s and bandwidth Hz. This gives samples, taken at intervals of seconds, in each coherence block (see , Section 2.1.3 for further details; note that samples as used here are not equivalent to OFDM samples). Of these samples, are used for pilot transmission, for uplink data transmission, and for downlink data transmission. We thus have . In each coherence block, channel state information (CSI) is obtained for each scheduled end device by the device sending a pilot signal of length samples. Using the example scheme for orthogonal pilots given in , Section 3.1.1, we thus have a total of orthogonal pilot signals, but in general we have pilots (see Footnote 1). is thus the maximum number of devices for which CSI can be obtained in each coherence block. The pilot length and number of pilots are independent of the total number of end devices that may be associated with the base station; however, if, as in the scenarios we consider, the number of end devices is greater than , user scheduling is required to assign users to coherence blocks, with no more than users active in each block.
Footnote 1: In the ideal case, for a large number of users , we would expect to have , since is the maximum number of mutually orthogonal vectors of length , and will thus also give the maximum number of simultaneous users that can be allocated pilots and served by the base station. However, in some cases the number of pilots used may be smaller. We may for example have mobile users that require more frequent pilots than static users (due to their channels having a shorter coherence time), or we may assign multiple pilots within each coherence block to the same user in order to improve the quality of the CSI obtained.
Each end device has a coefficient describing the large scale effects on the device’s channel. We further denote the uplink SNR by and the downlink SNR by . These two parameters are as defined in , Section 2.1.8, that is, they can be interpreted as SNRs when the median of is 1.0, but in general scales with . For each end device, we also have the mean-square channel estimate, . With perfect CSI, we have , otherwise is given by
for all .
On the uplink, each device transmits with a power control coefficient , , where a power control coefficient of indicates the device does not transmit at all, while indicates the device transmits with full power. On the downlink, the power control coefficient indicates the power that the base station allocates to transmission to device . The sum of the downlink power control coefficients gives the total normalized power with which the base station transmits, and so the must sum to at most , indicating full transmission power from the base station.
We will consider two precoding schemes for the base station: maximum ratio combining and zero forcing. In maximum ratio combining, we seek to maximize the power of each device signal at that device (downlink), or when recovering the received signal (uplink). For zero forcing, we instead seek to produce nulls in the channel at devices other than the relevant one, thus creating zero interference between the device signals if we have perfect CSI. However, there is still interference in the case of imperfect CSI. The effective SINR for the uplink and downlink for each of the two precoding schemes is shown in Table II. For a more complete discussion of the precoding schemes, as well as the derivation of these expressions, see , Chapter 3.
Note that the expressions given in the table apply when all devices are active simultaneously, and there are sufficient pilots for each device to be assigned one. For fewer active devices, the expressions need to be adjusted accordingly, as we do in our optimization formulations in Section IV. Specifically, the summations in the denominators should be taken only over the set of active users, rather than all , and the term that appears in the numerator of the zero forcing expressions should instead become , where is the number of active users. In this paper, we do not consider pilot reuse (that is, multiple end devices using the same pilot in the same coherence block), nor the resulting pilot contamination.
Table II: Effective SINR for the uplink and downlink (columns: Maximum Ratio Combining, Zero Forcing).
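As an illustration of the adjustment described above, the following sketch evaluates effective SINRs over a set of active devices only. It assumes the standard single-cell effective SINR expressions from the massive MIMO textbook literature, since the expressions of Table II are not reproduced in this text. All names (`effective_sinr`, `mean_square_estimate`, `M`, `rho`, `beta`, `gamma`, `eta`, `tau_p`) are hypothetical labels chosen for the example, not the paper's notation.

```python
from typing import Dict, Sequence


def mean_square_estimate(beta: float, rho_ul: float, tau_p: int) -> float:
    """Mean-square channel estimate under imperfect CSI (assumed textbook form).

    Approaches beta (the perfect-CSI value) as tau_p * rho_ul grows.
    """
    return tau_p * rho_ul * beta ** 2 / (1.0 + tau_p * rho_ul * beta)


def effective_sinr(k: int, active: Sequence[int], M: int, rho: float,
                   beta: Dict[int, float], gamma: Dict[int, float],
                   eta: Dict[int, float], scheme: str, link: str) -> float:
    """Effective SINR of device k when only the devices in `active` are served.

    scheme: "MR" (maximum ratio) or "ZF" (zero forcing); link: "UL" or "DL".
    rho is the uplink SNR for link="UL" and the downlink SNR for link="DL".
    """
    n_active = len(active)
    # Zero forcing spends n_active degrees of freedom on interference nulling.
    array_gain = (M - n_active) if scheme == "ZF" else M
    signal = array_gain * rho * gamma[k] * eta[k]
    if link == "UL":
        if scheme == "ZF":
            # Only the channel estimation error of each active device remains.
            interference = sum((beta[j] - gamma[j]) * eta[j] for j in active)
        else:
            interference = sum(beta[j] * eta[j] for j in active)
    else:
        total_power = sum(eta[j] for j in active)
        if scheme == "ZF":
            interference = (beta[k] - gamma[k]) * total_power
        else:
            interference = beta[k] * total_power
    return signal / (1.0 + rho * interference)


if __name__ == "__main__":
    beta = {0: 1.0, 1: 0.5}
    gamma = {k: mean_square_estimate(b, rho_ul=1.0, tau_p=10) for k, b in beta.items()}
    eta = {0: 1.0, 1: 1.0}  # full uplink power for both devices
    for scheme in ("MR", "ZF"):
        print(scheme, effective_sinr(0, [0, 1], 100, 1.0, beta, gamma, eta, scheme, "UL"))
```

Passing the active set explicitly, rather than all devices, is what lets the same function be reused inside a scheduling routine where the set of served devices changes from block to block.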
III-B Traffic Model
We now seek to schedule the traffic demands of the devices in as few coherence blocks as possible. For each device , we define as the device’s uplink demand, and as the device’s downlink demand. The demands are expressed as a number of coherence blocks, that is, to satisfy its uplink demand, device must transmit during the entire uplink phase (of channel uses) in coherence blocks, and similarly for the downlink case. A device may transmit in both the uplink and downlink phases of a given coherence block, but a device may not transmit in either phase if it has not been allocated a pilot during that coherence block. In order to transmit successfully, a device must also have an SINR in the relevant transmission phase larger than or equal to its SINR threshold, .
Different devices may have different SINR thresholds; this means that the traffic volume for devices in absolute terms (i.e. traffic volume in bytes) may not be equal, even when devices have equal demands, as they may transmit at different rates (achieved using different MCSs) during their assigned coherence blocks. More precisely, if a device has bytes of data to transmit, and chooses an SINR threshold allowing for a data rate of bytes per coherence block, then ’s traffic demand in coherence blocks is given by . In this way, devices can have both individual data volumes and QoS requirements (specifically data rates).
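The conversion from a data volume to a demand in coherence blocks can be sketched as follows. This assumes the demand is the data volume divided by the per-block rate, rounded up to a whole number of blocks; the function and parameter names are hypothetical.

```python
def demand_in_blocks(volume_bytes: int, bytes_per_block: int) -> int:
    """Number of coherence blocks needed to carry volume_bytes.

    bytes_per_block is the data rate implied by the device's chosen SINR
    threshold (assumed to be given). Integer ceiling division avoids
    floating-point rounding.
    """
    if volume_bytes < 0:
        raise ValueError("volume_bytes must be non-negative")
    if bytes_per_block <= 0:
        raise ValueError("bytes_per_block must be positive")
    return -(-volume_bytes // bytes_per_block)


if __name__ == "__main__":
    # Two devices with the same data volume but different rates
    # end up with different demands.
    print(demand_in_blocks(1000, 300), demand_in_blocks(1000, 100))
```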
The device demands could represent either the current traffic queued at the devices, or recurring demands induced by periodic traffic as is typical in sensor network monitoring scenarios. In the former case, information about the device demands needs to be updated regularly at the base station, which places more stringent constraints on the time needed to schedule the devices’ transmissions. With recurring demands, on the other hand, a schedule can be established once and then used for a long time, making longer scheduling times more feasible. Varying demands also incur a signaling overhead in order to transmit information about device demands to the base station. However, this overhead can be quite small, for example a single value representing the current queue length, or may even be avoided by predicting traffic demands at the base station. This is particularly feasible if the device traffic is not delay sensitive, as it allows longer scheduling windows and/or the possibility to not serve a device demand fully in a given scheduling window in case of traffic prediction error.
III-C Device Scheduling
We define a frame as the set of coherence blocks required to meet all device demands in the current scheduling window. We then seek to minimize the frame, that is, find the least number of coherence blocks required for all devices to send and receive all their traffic. Minimizing the frame thus maximizes the throughput for contiguous frames, or, alternatively, maximizes the sleep time of devices if there is a delay (sleep cycle) between successive frames. Minimizing the frame also facilitates network slicing, an important component of the 5G architecture , by freeing more resources to be used by other slices. The frame minimization problem consists of allocating devices to coherence blocks in which they will transmit.
A set of devices that can successfully transmit and/or receive together during a coherence block form a compatible set. In each c-set, each device may take the role of a transmitter, a receiver, or both. If a node is a transmitter, it transmits data during the uplink phase of the coherence block; if a node is a receiver, it receives data during the downlink phase of the coherence block. If a node has both roles, it is active in both phases. Each node in the c-set must have a pilot assigned to it in order to transmit or receive during a given coherence block, and as such, the number of devices in a c-set is bounded from above by , regardless of the achievable SINRs when transmitting or receiving simultaneously.
This definition of a c-set differs from that used in wireless mesh networks (see Section II) in three ways. Firstly, the expressions used here for the SINRs are different, as they are derived from massive MIMO channel models and precoding schemes, rather than the models for single omnidirectional antennas typically used in mesh networks. Secondly, the number of available pilots limits the number of nodes that may be placed in the same c-set. No more nodes may be added to a c-set once all pilots are assigned, even if all nodes’ SINR conditions would be met. For a given number of pilots, this reduces the number of possible c-set solutions, thus reducing the complexity of c-set generation, which we will apply in order to solve the frame minimization problem. Finally, in mesh network c-sets, a node cannot be both a transmitter and receiver, whereas this is possible in massive MIMO c-sets; this is a consequence of the massive MIMO coherence block structure, which exploits channel reciprocity (in the TDD case) to use uplink pilots for both uplink and downlink channel estimation.
The family of all possible c-sets is denoted by . The family of c-sets in which device is a transmitter is denoted by , and the family of c-sets in which device is a receiver is denoted by . Note that and are not necessarily disjoint. For a given c-set , denotes the set of nodes that transmit in the uplink phase in coherence blocks allocated to , and denotes the set of nodes that receive in the downlink phase in coherence blocks allocated to . Again, and are in general not disjoint.
IV Frame Minimization Problem and Solution Approach
In this section we will formulate the main problem of this paper and describe a suitable approach for its optimization.
IV-A Problem Formulation
The optimization problem studied in this paper — frame minimization — is formulated as the following integer programming (IP) problem. The problem will be referred to as the main problem and denoted by MP/IP.
Here, is the family of all c-sets and are (integer) decision variables indicating the number of coherence blocks in which the set of active devices (and their roles) is given by c-set . The objective (1a) then seeks to minimize the total number of coherence blocks used. Constraints (1b) ensure, for each device , that the number of coherence blocks in which the device will be active during the uplink phase of the block will be sufficient to meet the device’s uplink demand. Constraints (1c) are similar, but for the downlink demands.
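Based on the description above, formulation (1) can be written as in the following sketch. The symbols are assumptions made for illustration, since the paper's notation is not reproduced in this text: $\mathcal{K}$ is the set of end devices, $\mathcal{C}$ the family of all c-sets, $\mathcal{C}^{\mathrm{UL}}_k$ and $\mathcal{C}^{\mathrm{DL}}_k$ the families of c-sets in which device $k$ is a transmitter and a receiver, respectively, $t_c$ the number of coherence blocks assigned to c-set $c$, and $d^{\mathrm{UL}}(k)$, $d^{\mathrm{DL}}(k)$ the uplink and downlink demands of device $k$.

```latex
\begin{align}
\min \quad & \sum_{c \in \mathcal{C}} t_c \tag{1a} \\
\text{s.t.} \quad & \sum_{c \in \mathcal{C}^{\mathrm{UL}}_k} t_c \ge d^{\mathrm{UL}}(k), && k \in \mathcal{K} \tag{1b} \\
& \sum_{c \in \mathcal{C}^{\mathrm{DL}}_k} t_c \ge d^{\mathrm{DL}}(k), && k \in \mathcal{K} \tag{1c} \\
& t_c \in \mathbb{Z}_+, && c \in \mathcal{C}
\end{align}
```

The SINR and pilot conditions do not appear explicitly here: they are encoded in the definition of which device sets belong to $\mathcal{C}$.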
Observe that in formulation (1), as in the remainder of this paper, we follow the notational convention that variables are indexed by subscripts (like index in variable ), and parameters (constants) by round brackets (like index in parameter ). This convention, used for example in , makes it easier to distinguish variables from parameters in problem formulations.
IV-B Solution Approach
In the solution approach presented below we will consider restricted versions of MP/IP, denoted by MP/IP(), where only c-sets from a predefined subfamily of the family (of all c-sets) are allowed. In such a case the list of variables in (1) is restricted to , and the families are restricted accordingly. Below, the linear relaxation of MP/IP() (where the decision variables are continuous rather than integer) will be denoted by MP/LR(). Note that with this notation, MP/IP() denotes the main problem MP/IP (i.e., formulation (1)), and MP/LR() its linear relaxation. The latter formulation will be simply referred to as MP/LR.
Since the total number of c-sets (and hence the number of variables ) grows exponentially with the number of devices and pilots, the MP/IP formulation (and, for that matter, the MP/LR() formulation also) becomes non-compact. Hence, it is not feasible in general to solve the frame minimization problem (1) using all possible c-sets. Instead, we apply column generation .
The approach is to start (in Phase 1) by solving the linear relaxation MP/LR() using column generation — in our context column generation is referred to as c-set generation [37, 41] since in formulation (1) columns, i.e., variables, correspond to c-sets — and then (in Phase 2) solve MP/IP() restricted to the family of the c-sets resulting from Phase 1.
IV-B1 Phase 1. Solving the linear relaxation of the main problem by c-set generation
Consider the linear relaxation MP/LR() for a given subfamily (list) of the family of all c-sets , i.e., the following linear programming problem formulation:
The above problem will be called the primal problem.
Now, using the (dual) variables specified in square brackets to the left of constraints (2b) and (2c), we form the dual to the (linear programming) primal problem (2) to give the following linear programming problem formulation [50, 51]:
(Recall that () denotes the set of devices that transmit (receive) in compatible set ; see Table I.) This dual problem will be denoted by DP() and referred to as the master problem in the column generation algorithm formulated below.
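For reference, the primal (2) and its dual (3) can be sketched as follows. The notation is assumed for illustration: $\mathcal{C}'$ is the current c-set list, $t_c$ the (now continuous) number of coherence blocks assigned to c-set $c$, $\pi_k$ and $\lambda_k$ hypothetical names for the dual variables attached to the uplink and downlink demand constraints of device $k$, and $\mathcal{U}(c)$, $\mathcal{D}(c)$ the sets of devices that transmit and receive in $c$.

```latex
% Primal (2): linear relaxation restricted to the list C'
\begin{align}
\min \quad & \sum_{c \in \mathcal{C}'} t_c \tag{2a} \\
\text{s.t.} \quad [\pi_k] \quad & \sum_{c \in \mathcal{C}' :\, k \in \mathcal{U}(c)} t_c \ge d^{\mathrm{UL}}(k), && k \in \mathcal{K} \tag{2b} \\
[\lambda_k] \quad & \sum_{c \in \mathcal{C}' :\, k \in \mathcal{D}(c)} t_c \ge d^{\mathrm{DL}}(k), && k \in \mathcal{K} \tag{2c} \\
& t_c \in \mathbb{R}_+, && c \in \mathcal{C}'
\end{align}

% Dual (3): one constraint per c-set in the list
\begin{align}
\max \quad & \sum_{k \in \mathcal{K}} \left( d^{\mathrm{UL}}(k)\, \pi_k + d^{\mathrm{DL}}(k)\, \lambda_k \right) \tag{3a} \\
\text{s.t.} \quad & \sum_{k \in \mathcal{U}(c)} \pi_k + \sum_{k \in \mathcal{D}(c)} \lambda_k \le 1, && c \in \mathcal{C}' \tag{3b} \\
& \pi_k,\ \lambda_k \in \mathbb{R}_+, && k \in \mathcal{K} \tag{3c}
\end{align}
```

Each primal column (c-set) has objective coefficient 1 and a 0/1 entry in every demand row, which is why the right-hand side of (3b) is 1 and its left-hand side is simply the sum of the dual values of the devices active in $c$.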
Consider a dual optimal solution , and suppose there exists a c-set, , say, outside the current list with . When is added to the list () then the new dual has one more constraint (3b) that corresponds to , and this particular constraint is violated by the current optimal solution . This means that the new dual polytope (for the updated c-set list ) determined by conditions (3b)-(3c) is a proper subset of the previous dual polytope, as the current optimal dual solution is cut off by the new dual constraint. Therefore, in the updated dual the maximum of (3a) cannot be greater than the previous maximum, and in most cases it will in fact decrease. Thus, taking into account that the maximum of the dual problem is equal to the minimum of the primal problem (this is the strong duality property of linear programming, which holds for problems such as MP/LR that have a finite optimum [50, 51]), adding the c-set will usually decrease the frame length. On the other hand, if no such new c-set exists, the final dual (and primal) solution thus obtained is optimal even if were to be extended to the list of all possible c-sets. The reason for this is that fulfills constraint (3b) for all .
The problem of finding a c-set that maximizes the quantity
over the family (i.e., the family of all c-sets) will be called the pricing problem and denoted by PP(). The pricing problem, a crucial element of the column generation algorithm, will be discussed in detail in Section IV-C.
The above observations lead to the following iterative c-set generation algorithm that solves the linear relaxation of the main problem MP/IP.
CG (c-set generation) algorithm
- Step 1:
Define initial and let .
- Step 2:
Solve the master problem DP() defined in (3); let be the resulting optimal dual solution.
- Step 3:
Solve the pricing problem PP(); let be a c-set that maximizes the quantity defined in (4) over .
- Step 4:
If , then and go to Step 2.
- Step 5:
; solve the primal problem MP/LR() defined in (2), and stop.
When the CG algorithm stops, the primal solution calculated in Step 5 solves the linear relaxation, MP/LR(), of the main problem MP/IP. Certainly, for the CG algorithm to start properly, the predefined initial list of c-sets, , appearing in Step 1 must assure feasibility of MP/LR(). Observe also that the pricing problem in Step 3 generates each new c-set (if any), whose constraint (3b) is maximally broken by . More information on the CG algorithm will be given in Section V-A.
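The loop of the CG algorithm can be illustrated with a deliberately small sketch. It rests on several simplifying assumptions that are not part of the formulation above: only uplink demands are modeled, so the dual constraint for a c-set reduces to "the sum of the dual values of its devices is at most 1"; the SINR conditions are replaced by a hypothetical `compatible` predicate; the master problem is solved by exact vertex enumeration and the pricing problem by brute force, both of which are only viable for a handful of devices. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(A, b):
    """Solve A x = b exactly (Gauss-Jordan); return None if A is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [v / scale for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def solve_master_dual(demands, csets):
    """Maximize sum_k d_k * pi_k s.t. sum_{k in c} pi_k <= 1 for listed c, pi >= 0.

    Enumerates the vertices of the (bounded) dual polytope; toy sizes only.
    """
    n = len(demands)
    rows = [([1 if k in c else 0 for k in range(n)], 1) for c in csets]
    rows += [([1 if k == j else 0 for k in range(n)], 0) for j in range(n)]
    best_val, best_pi = None, None
    for tight in combinations(rows, n):
        pi = solve_square([r[0] for r in tight], [r[1] for r in tight])
        if pi is None or any(p < 0 for p in pi):
            continue
        if any(sum(pi[k] for k in c) > 1 for c in csets):
            continue
        val = sum(d * p for d, p in zip(demands, pi))
        if best_val is None or val > best_val:
            best_val, best_pi = val, pi
    return best_val, best_pi


def solve_pricing(pi, num_pilots, compatible):
    """Brute-force pricing: the compatible c-set with the largest dual weight."""
    n = len(pi)
    candidates = [frozenset(c) for size in range(1, num_pilots + 1)
                  for c in combinations(range(n), size) if compatible(c)]
    best = max(candidates, key=lambda c: sum(pi[k] for k in c))
    return best, sum(pi[k] for k in best)


def cset_generation(demands, num_pilots, compatible):
    csets = [frozenset([k]) for k in range(len(demands))]            # Step 1
    while True:
        value, pi = solve_master_dual(demands, csets)                # Step 2
        new_cset, price = solve_pricing(pi, num_pilots, compatible)  # Step 3
        if price > 1:                                                # Step 4
            csets.append(new_cset)
        else:
            # Step 5: by strong duality the dual optimum equals the primal one.
            return value, csets


value, csets = cset_generation([2, 2, 2], num_pilots=2, compatible=lambda c: True)
print(value, len(csets))
```

For three devices with a demand of two blocks each and two pilots, the sketch starts from the three singleton c-sets (frame length 6), adds the three pairs one by one, and stops at an LP frame length of 3 with six c-sets in the list.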
IV-B2 Phase 2. Solving the main problem
After generating the c-set list , the main problem (1) restricted to (denoted by MP/IP()), is solved by the branch-and-bound (B&B) algorithm . Note that the integer solution, , obtained thereby may in general be suboptimal, since there can exist c-sets that are not necessary to solve the linear relaxation (2) but are required for achieving the optimum of the MP/IP (where all c-sets are considered). We will return to this issue in Section V-C.
IV-C Pricing Problems
The form of the pricing problem will depend on the precoding method used by the base station. Below, we give pricing problems for both maximum ratio combining and zero forcing. Both of these problems require variable multiplications (so-called bi-linearities) that need to be resolved by adding auxiliary variables and constraints. We however omit these in the following. In the appendix, we describe the necessary auxiliary variables and accompanying constraints to render the pricing problems as proper mixed-integer programming formulations. Also, discussion of computational efficiency issues of the pricing problems will be deferred to Section V-B.
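As an illustration of how a bi-linearity can be resolved, the following sketch checks one standard linearization of the product of a binary variable and a bounded continuous variable. It is not necessarily the construction used in the appendix; the names `x`, `y`, `z`, and `upper` are hypothetical, with `x` standing for a binary pilot variable and `y` for a power control coefficient in the range from 0 to `upper`.

```python
def product_bounds(x, y, upper):
    """Range of z permitted by the four linear constraints

        z <= upper * x,   z <= y,   z >= y - upper * (1 - x),   z >= 0,

    for binary x and 0 <= y <= upper.
    """
    lo = max(0.0, y - upper * (1 - x))
    hi = min(upper * x, y)
    return lo, hi


# For every tested combination the constraints leave exactly one value, z = x * y.
for x in (0, 1):
    for y in (0.0, 0.25, 1.0):
        lo, hi = product_bounds(x, y, upper=1.0)
        assert lo == hi == x * y
```

The auxiliary variable `z` can then replace the product wherever it appears, at the cost of one extra continuous variable and four linear constraints per product.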
IV-C1 Maximum Ratio Combining (MRC)
The pricing problem for MRC can be formulated as follows:
where the decision variables (called pilot variables) and (power control variables) are listed in (5f) and (5g), respectively. The constant appearing in constraints (5b)-(5c) is defined as , for and . Note that in the above formulation the constraint on the power control variables is not explicit and involves a set of allowable vectors . This set, a parameter of the pricing problem, will be defined in Section IV-D for selected power control schemes. Note also the quantities and (optimal values of the dual variables) appearing in objective (5a) are basic parameters of the pricing problem.
The binary decision variable will be equal to 1 if, and only if, end device is to be included in the new c-set. Similarly, and indicate whether end device is to be a transmitter and/or receiver, respectively, in the new c-set. The objective (5a) selects a new c-set that will maximally violate the corresponding constraint (3b) in the dual problem. Constraints (5b) and (5c) ensure the SINR thresholds are met for each device for the uplink and downlink phases, respectively, and are based on the expressions given in Table II.
Constraints (5d) ensure that a device is included in the c-set if it is set as a transmitter or receiver, and that the device is not included in the c-set if it is inactive in both the uplink and downlink phases. Finally, constraint (5e) ensures that the c-set does not contain a greater number of devices than there are pilots available to assign to them in a given coherence block. Clearly, if is an optimal vector , then the optimal c-set that solves the pricing problem is determined as follows: , with and .
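The structural part of the pricing problem, i.e., constraints (5d) and (5e) and the recovery of the c-set from an optimal solution, can be sketched as follows. The SINR constraints (5b)-(5c) are deliberately left out, and the argument names `x`, `up`, and `down` are hypothetical stand-ins for the inclusion, transmitter, and receiver indicator vectors.

```python
def extract_cset(x, up, down, num_pilots):
    """Check (5d)-(5e) on 0/1 vectors and return (devices, transmitters, receivers)."""
    for k, included in enumerate(x):
        # (5d): a device is in the c-set exactly when it transmits or receives.
        if included != max(up[k], down[k]):
            raise ValueError(f"device {k} violates the inclusion constraint")
    # (5e): no more devices than pilots in one coherence block.
    if sum(x) > num_pilots:
        raise ValueError("more devices than available pilots")
    devices = {k for k, v in enumerate(x) if v}
    transmitters = {k for k, v in enumerate(up) if v}
    receivers = {k for k, v in enumerate(down) if v}
    return devices, transmitters, receivers


print(extract_cset([1, 0, 1], [1, 0, 0], [1, 0, 1], num_pilots=2))
```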
IV-C2 Zero Forcing (ZF)
The pricing problem for ZF is similar to that for MRC, but instead uses the zero forcing effective SINR expressions.
Integer variables and are introduced and set to the number of active nodes in the uplink and downlink phases, respectively, via constraints (6d) and (6e). This is needed for the SINR expressions in constraints (6b) and (6c), in order to take the number of active users rather than the total number of users on the left hand side.
IV-D Adding Power Control Optimization to Pricing
The pricing problems given above are incomplete since the constraints on the power control coefficients and are not explicitly defined. There are a number of different possible power control schemes that may be used. Here, we will consider three of them to compare, where each of these schemes may optionally include power control on the uplink or not. In all the considered cases power control optimization is achieved by specifying the set (appearing in constraints (5g) and (6j)) by means of explicit constraints on variables .
IV-D1 Joint Power and C-Set Composition Optimization
The first scheme is to jointly optimize the power control coefficients and the c-set composition. This requires the constraints
to be incorporated into both the pricing problems (5) (for MRC) and (6) (for ZF) instead of (5g) and (6j), respectively. The two constraints simply ensure that the devices and the base station, respectively, do not exceed their maximum transmission power.
IV-D2 Fair Power Control Optimization
The second power control scheme is (max-min) fair power control, as defined in , Section 5.3. In this scheme, nodes adjust their transmission power so as to maximize the minimum SINR of any node. It can be proven that this always results in all nodes achieving a single, common SINR — see , Section 5.3.1. Nodes adjust their power according to their relative channel quality, such that the node(s) with the worst channel(s) will transmit at full power, while nodes with good channels reduce their transmission power. For this scheme, in the case of MRC we need to replace (5g) with the following constraints.
where . Constraints (8b)–(8e) select the lowest of any node active in the uplink phase to be the numerator in constraint (8a), with constraint (8b) providing an upper bound, and constraints (8c)–(8e) providing a lower bound by selecting one active device, if there are any. If no devices are active in the uplink phase, will be zero. Constraints (8f) are taken from the fair power control expressions derived in , Section 5.3.1.
IV-D3 Static Power Control Optimization
Since the above power control schemes require changing the power control coefficients each time a new c-set becomes active, they add a significant signaling overhead to inform the end devices of the values of the power control coefficients they should use. At the base station, new power control coefficients must be calculated for each coherence block, or, alternatively, stored for each c-set. To alleviate these problems, we may instead use a third scheme, which we will call static power control, in which the variables and are removed from the power control constraints by replacing each of them with , thus calculating power control coefficients over the entire set of devices. The and then become constant parameters instead of decision variables.
It is also possible to not perform any power control on the uplink. This is simpler to implement on resource-constrained end devices. In this case, all are set to 1 for each of the above schemes. However, the power control constraints for the downlink must still be determined using one of the other methods. Without power control on the uplink, near-far effects must instead be mitigated by the assignment of devices to c-sets.
V Complexity and Efficiency of the Solution Approach
The solution approach presented in Section IV is, by the very nature of the frame minimization problem MP/IP, quite complicated. Although we cannot definitely say that MP/IP is not polynomial, we may expect that its linear relaxation MP/LR() is already NP-hard. Reasons for this include that the linear relaxation in question is non-compact (it has an exponential number of variables); that the pricing problems required in the CG algorithm (formulated in Section IV-C) are very similar to the NP-hard maximum clique problem; and also that analogous pricing problems, for generating c-sets in wireless mesh networks, are NP-hard (see [53, 41]). Thus, assuming that the pricing problems used in this paper are also NP-hard, we can expect, by polynomial equivalence of separation and optimization [54, 55], polynomiality of MP/LR() to be unlikely. Bearing this in mind, finding a compact formulation for MP/LR() is also unlikely since compact linear problems are solvable in polynomial time.
In the following we will discuss the efficiency issues related to our proposed approach and comment on its heuristic aspects. The discussion of this section will be further illustrated by the results presented in Section VI-A5.
V-A Solving the Linear Relaxation
As already mentioned, the predefined initial list of c-sets, , appearing in Step 1 of the CG algorithm must assure feasibility of MP/LR(), which means that each must be contained in at least one c-set in and in at least one c-set , unless ’s uplink or downlink demand (respectively) is zero. An example of such a is the family composed of all c-sets in which only a single device transmits and receives in the coherence block; this is always a valid c-set since there will be no interference and only one pilot signal is needed for the single device. Clearly, the minimal frame length for this particular is equal to .
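A minimal sketch of the singleton initial list and its frame length follows. It assumes, as described above, that the single device of a c-set both transmits and receives in the same coherence block, so that a device needs as many blocks as the larger of its two demands; the function names are hypothetical, and the demand levels 10 and 2 are those used later in the numerical study.

```python
def singleton_csets(num_devices):
    """Initial c-set list: one c-set per device, active on uplink and downlink."""
    return [{"transmitters": {k}, "receivers": {k}} for k in range(num_devices)]


def singleton_frame_length(demands):
    """Frame length when only singleton c-sets are used.

    demands is a list of (uplink_blocks, downlink_blocks) pairs; under the
    assumption above each device needs max(uplink, downlink) blocks.
    """
    return sum(max(up, down) for up, down in demands)


# 40 devices: 20 with high demand in both directions, 20 with low demand.
demands = [(10, 10)] * 20 + [(2, 2)] * 20
print(len(singleton_csets(len(demands))), singleton_frame_length(demands))
```

Such a list is far from optimal, but it guarantees that the first master problem is feasible.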
The final solution of the column-generation algorithm is always optimal for any initial c-set list . Thus, the choice of a particular affects only the number of iterations of the CG algorithm, but usually to a small extent. It turns out that the CG algorithm achieves a solution close to the optimum of MP/LR() very quickly (in fewer than 20 iterations in our study, see Section VI-A1), no matter what the initial list is. In effect, most of the iterations are spent on decreasing such a quickly found suboptimal solution to eventually reach the optimum which is not significantly smaller. Hence, the choice of a particular initial c-set list is not really important. This phenomenon is in fact typical for non-compact linear programming problems solved by column generation; an illustrative example can be found in .
The CG algorithm generates the columns of the primal problem (corresponding to the c-sets) using the dual (3) as the master problem in the main loop. This means that in every iteration a new constraint (3b) corresponding to the c-set found by the pricing problem is added to the master. Observe that this is equivalent to classical column generation using the simplex method . If the primal problem MP/LR() is solved by the simplex algorithm, then when a new column (variable ) is added to (2) and this column replaces one of the current basic variables, the local rate of decrease in the value of the primal objective (2a) will be equal to (in the simplex method this value is called the reduced cost of variable ), and thus maximal over all c-sets. After column enters the basis, the value of (2a) will be decreased by , where is the value assigned to variable by the simplex pivot operation. If the current basic solution happens to be degenerate then may have to stay at the zero value, and in effect the objective function will not be decreased. Nevertheless, adding variable to the problem is necessary for the simplex algorithm to proceed towards the optimal vertex solution. Certainly, primal problem (2) could equally be used for the master problem (instead of the dual) since optimal dual variables , the parameters of the pricing problem, can be straightforwardly calculated from the optimal primal simplex basis .
It should be noted that the column generation method (and hence the CG algorithm), just like the simplex algorithm, does not guarantee a polynomial number of iterations to reach the optimum (what is guaranteed is a finite number of iterations). Fortunately, in practical applications (like ours) the simplex algorithm is usually efficient, with a number of steps that is observed to grow polynomially (typically in proportion to the number of variables), and the same holds for column generation. In any case, however, there is virtually no alternative to column generation when it comes to solving non-compact linear programs.
As already mentioned in Section IV-B1, when the CG algorithm terminates, the final c-set list, , will contain all c-sets necessary to solve the full linear relaxation of the main problem. This means that any optimal solution of MP/LR() is optimal for MP/LR(), i.e., the linear relaxation of the main problem MP/IP formulated in (1). Moreover, in the final simplex solution of MP/LR() the optimal vertex (optimal basic solution of the standard form of the primal problem) will contain at most non-zero values in . This observation implies that the gap between the optimal solution of MP/IP and its lower bound computed from MP/LR() is not greater than because the vector is a feasible solution of the main problem MP/IP. In fact, in practice this gap can be considerably smaller than because the actual number of non-zero elements in is equal to minus the number of non-binding constraints in (2b)-(2c) for . Moreover, if we assume that the fractional parts of are random then the average gap will be equal to (where is the number of non-zero elements in vector ). This means, that the quality (strength) of the linear relaxation MP/LR() will be good when the optimal value of the objective function (2a) is large compared with , which is the case when the demands and/or are large. Application scenarios where this would occur include monitoring or factory control networks, where traffic consists of regular updates with fixed sizes, so that traffic can be reliably predicted over a long period of time. Fortunately, it happens that even if this is not the case, the quality of the lower bound is very good, as illustrated in Section VI-A5.
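The rounding argument can be checked numerically with a small sketch. It uses a simplified, uplink-only view in which each device must be active in at least as many blocks as its demand; the c-sets, demands, and LP values below are made-up illustrative numbers, not results from this paper.

```python
import math


def served_blocks(csets, usage, num_devices):
    """Number of coherence blocks in which each device is active."""
    return [sum(t for c, t in zip(csets, usage) if k in c)
            for k in range(num_devices)]


def round_up(usage):
    """Round every fractional c-set usage up to the next integer."""
    return [math.ceil(t) for t in usage]


csets = [{0, 1}, {0, 2}, {1, 2}]   # hypothetical c-sets over three devices
demands = [3, 3, 3]                # blocks required per device
lp_usage = [1.5, 1.5, 1.5]         # fractional LP solution, frame length 4.5

ip_usage = round_up(lp_usage)      # [2, 2, 2], frame length 6
# Rounding up can only increase the service, so the demands stay satisfied.
assert all(s >= d for s, d in zip(served_blocks(csets, ip_usage, 3), demands))

gap = sum(ip_usage) - sum(lp_usage)
nonzero = sum(1 for t in lp_usage if t > 0)
assert gap <= nonzero              # gap bounded by the number of non-zero values
print(gap, nonzero)
```

Here the rounded solution exceeds the LP bound by 1.5 blocks, within the bound of 3 given by the number of non-zero values.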
V-B Solving the Pricing Problems
The pricing problems formulated in Section IV-C and used in successive iterations of the CG algorithm are mixed-integer programming problems with the number of binary decision variables proportional to (including additional variables to eliminate the bi-linearities, see the appendix). This is a reasonable number and, as it turns out, treating the pricing problems directly with the CPLEX mixed-integer programming solver, which applies the branch-and-bound algorithm, is sufficiently efficient compared to the remaining elements (i.e., the main problem, and the master problem in the CG algorithm) of the considered two-phase approach. Thus, we did not attempt to apply additional integer programming means, like introducing extra cuts to strengthen the linear relaxations of the pricing problems or even using the branch-and-cut (B&C) version of the B&B algorithm (see ) to improve the PP solution process efficiency.
V-C Solving the Main Problem
Solving the main problem (1) for a given (limited) list of c-sets requires a branch-and-bound algorithm . Such a B&B algorithm generates a rooted binary tree (called the B&B tree) composed of nodes (called the B&B nodes). Each such B&B node corresponds to a particular linear programming subproblem obtained from the linear relaxation MP/LR() by restricting the range of variables through the B&B node-specific lower and upper bounds. More precisely, extra constraints of the form are added to MP/LR() in the subproblem of the B&B node , where are the lower and upper bounds on , respectively, specific to B&B node . The algorithm starts with visiting the root of the B&B tree, where and for all . In general, when the algorithm visits a B&B node , it solves the subproblem of , selects a fractional in the resulting optimal solution , and (provided there is a fractional component in ) creates two new B&B nodes and that are called active nodes. In , the constraint is substituted by , and in by . Note that if vector is integer, a feasible integer solution (i.e., a feasible solution of MP/IP()) is achieved and the new B&B nodes are not created. Nor are new B&B nodes created when the optimal objective of the subproblem of (i.e., ) is greater than or equal to the current best feasible integer solution. After visiting a B&B node the algorithm proceeds to one of the other active B&B nodes and becomes inactive. There are several reasonable ways of visiting the active nodes, among them depth-first search (of the B&B tree).
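The branching step can be sketched as follows, assuming the standard rule in which a fractional value is excluded by rounding the upper bound down in one child node and the lower bound up in the other. The function name, the bound representation, and the choice of the first fractional component are illustrative assumptions.

```python
import math


def branch(lower, upper, solution, tol=1e-9):
    """Return None if the solution is integral; otherwise return two child
    nodes, each given as a (lower_bounds, upper_bounds) pair."""
    for c, value in enumerate(solution):
        if abs(value - round(value)) > tol:
            down_upper = list(upper)
            down_upper[c] = math.floor(value)   # child 1: variable <= floor(value)
            up_lower = list(lower)
            up_lower[c] = math.ceil(value)      # child 2: variable >= ceil(value)
            return (list(lower), down_upper), (up_lower, list(upper))
    return None


print(branch([0, 0], [10, 10], [2.0, 3.5]))
```

A node whose subproblem solution is integral produces no children, matching the description above; pruning by the best known integer solution would be handled by the caller.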
In fact, to assure true optimality, problem MP/IP should be solved using the branch-and-price (B&P) algorithm  instead of the price-and-branch (P&B) algorithm that underlies the two phase approach described in Section IV. The basic difference between B&P and P&B is that in the latter the CG algorithm is invoked only once, at the root node of the B&B tree, and then the linear subproblem solved at each of the subsequent B&B nodes assumes the subfamily computed at the root. B&P in turn, would apply the CG algorithm at each B&B node. This would consume excessive overall computational time already for medium size MIMO systems because of executing the CG algorithm at every B&B node.
With P&B the linear subproblems solved at the B&B nodes (obtained, as explained above, by adding constraints on the range of (continuous) variables to MP/LR()), are solved quickly by the CPLEX linear programming solver, provided the number of c-sets in is reasonable (this is the case in the numerical examples considered in Section VI). Moreover, the limited number of decision variables and good quality of the linear relaxations (tight lower bounds on the objective value of the corresponding MP/IP()) make the P&B solution process sufficiently efficient for our purposes. Here we could also allow decreasing the family used for solving MP/IP by letting the CG algorithm stop after relatively few iterations when the slope of the gain in the optimal value of the master is becoming flat and the current master solution is close to its final, optimal, value (see Section VI-A1).
It is important that the optimal solutions obtained with our P&B two-phase approach, i.e., optimal solutions of MP/IP(), are close to the lower bound on MP/IP (obtained with MP/LR()), as shown in Section VI-A5. It is also worth noticing that good quality solutions of the main problem (1) can be obtained by solving MP/IP(), where , is the family of c-sets obtained in Phase 1, and is the final optimal solution of Phase 1, i.e., of MP/LR(). This observation is illustrated in Section VI-A5. As already mentioned, the number of the c-sets in family is not greater than and can be much smaller than the number of the c-sets in family . Hence, the number of (integer) variables in MP/IP() can be much smaller than in MP/IP().
To end this section we note that it is possible to write down a (compact) mixed-integer problem formulation for MP/IP with the number of variables and constraints polynomial in
and the maximum frame length. In such a formulation, the c-sets for the consecutive coherence blocks of the optimized frame are specified explicitly by means of additional binary variables and constraints (for each coherence block) in the way used in the pricing problems. Such a formulation could be solved directly, using a mixed-integer programming solver, for example the solver available in CPLEX. However, the formulation in question would involve a number of binary variables (and constraints) that is far beyond the reach of IP solvers. Therefore, the c-set generation algorithm (where the pricing problem for finding the improving c-set in each iteration of the CG algorithm is used only once per iteration) applied in the proposed two-phase approach seems to be the only reasonable option for reaching good quality (near optimal) solutions of the main problem MP/IP.
VI Numerical Study
For our numerical study, formulations (1)–(9) were implemented in AMPL and experiments carried out using the CPLEX solver on an Intel Core i7-3770K CPU (3.5 GHz) with 8 virtual cores (4 cores with 2 threads each), and 8 GB RAM. The parameters used for our experiments are shown in Table III, and are based on the LuMaMi massive MIMO testbed, which has 100 antenna elements and uses a coherence block structure similar to that used in LTE, that is, 12 OFDM subcarriers and 7 symbols per coherence block. In practice, a coherence block can be larger, depending on the environment and device mobility; however, these parameters ensure coherence across time and frequency during the block for practical scenarios of interest. The first symbol is used for pilots, resulting in 12 available pilots in each coherence block. We set the pilot length to 1, giving the minimum channel estimation quality and thus the most challenging case for device scheduling.
In general, the large scale channel effects may include both path loss and large scale fading caused by, for example, shadowing from objects. (We assume that channel hardening has eliminated small scale effects on the channel .) However, for this study we derive the values of only from path loss based on the distance of each user from the base station. Since our optimization models rely only on the actual value of , placing nodes at a greater distance from the base station, such that they have greater path loss, is entirely equivalent to nodes placed closer to the base station but subject to shadowing or other large scale effects that reduce the channel gain. The path loss may be modeled as a function of the distance of each device to the base station, for some path loss exponent and reference distance . Theoretically, the choice of reference distance is immaterial. However, large differences in magnitude between the different parameters can lead to floating point calculation errors while solving the optimization problems, so we choose a reference distance that keeps the values used in our scenarios within reasonable ranges.
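As a sketch of the path loss model, the snippet below assumes the common distance power law in which the large scale gain equals the ratio of distance to reference distance raised to the negative path loss exponent. The exact expression used in this study is not reproduced here, so the formula and the function name are assumptions; the default exponent (3.7) and reference distance (200 m) are those of Table III.

```python
import math


def large_scale_gain(distance_m, exponent=3.7, reference_m=200.0):
    """Assumed power-law model: gain = (distance / reference) ** (-exponent)."""
    return (distance_m / reference_m) ** (-exponent)


for d in (50.0, 100.0, 200.0, 400.0):
    print(d, large_scale_gain(d))

# A device at the reference distance has unit gain by construction.
assert large_scale_gain(200.0) == 1.0
# Closer devices have larger gains; at 50 m the gain is 4 ** 3.7.
assert math.isclose(large_scale_gain(50.0), 4.0 ** 3.7)
```

With a 200 m reference, the gains for distances between 50 m and 500 m stay within a few orders of magnitude of 1, which is the numerical motivation given above for choosing the reference distance.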
For simplicity, we model the location of the base station as a single point, that is, each end device has the same distance to all antenna elements. The SNR and path loss exponent used are based on typical values for outdoor, non line-of-sight transmission. For these experiments, we gave all devices the same SINR threshold, and so in the following we will without loss of generality refer to the threshold as simply .
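Table III: Parameters used in the experiments.

| Parameter | Value |
| --- | --- |
| Uplink SNR | 10 dB |
| Downlink SNR | 10 dB |
| Number of antennas | 100 |
| SINR threshold | 1.0 (0 dB) |
| Number of pilots | 12 |
| Path loss exponent | 3.7 |
| Reference distance | 200 m |
| Number of users | 40 |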
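Table IV: Experiment configurations.

| Experiment | Near distance | Far distance | Other parameters |
| --- | --- | --- | --- |
| 1 | 50 m | 200 m | |
| 2 | 200 m | 400 m | |
| 3 | 50 m | 100 m | |
| 4 | 50 m | 100 m | SINR threshold varied |
| 5 | 50 m | 100 m | Number of users: 4…40, step 4 |
| 6 | 50 m | 500 m | Number of users: 40 total: 8 near, 32 far |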
We conducted six different experiments, according to the configurations shown in Table IV. Parameters not mentioned in the table are as shown in Table III. Experiments 1–3 explore different near and far distances for the device groups, while Experiment 4 tests the effect of the SINR threshold, and Experiment 5 varies the number of devices. Experiment 6 tests an unbalanced scenario in which there are a small number of nodes with good channel conditions (close to the base station), and a larger group of nodes with much poorer channels (far from the base station).
We defined six different scenarios for the experiments, shown in Table V. In each scenario, the nodes are divided into two groups, a near group and a far group, with half of the nodes in each group, except for Experiment 6, where there are 8 nodes in the near group and 32 nodes in the far group. The distances of the near and far groups from the base station are different for different experiments, but in each case the near group is closer. There are two traffic demand levels, high demand (10 coherence blocks) and low demand (2 coherence blocks). The scenarios provide different combinations of low and high uplink and downlink demands for nodes close to and far from the base station. The distances of the devices are important for power control, since close devices will have a stronger signal than far devices.
Table V: Traffic demand scenarios 1–6 (columns: Group, Uplink demand, Downlink demand).
We tested all six scenarios for each experiment configuration. For all experiments, we found the minimal frame size for each of the power control schemes (optimal, fair, and static). We also tested optimal power control with no uplink power control, which we will hereafter call the downlink power control scheme. For downlink power control, all uplink power control coefficients were set to 1.0 (0 dB). Both maximum ratio combining and zero forcing were evaluated. Our experiments generated a large amount of data, although in many cases the results were as would be expected and/or were similar across the different scenarios and experiments. Because of this, in the following sections, we give only a summary of key experimental results. The full results are available in . In our results, the power values obtained are not the actual energy used by the nodes, but rather represent sums of the power control coefficients used in the coherence blocks in the frame. Total power is thus the sum of all the power control coefficients (both uplink and downlink) for all nodes active in each allocated block. Similarly, max node power is the sum of the power coefficients in all blocks in which a node was active, for the node with the highest such sum, that is, the total power for the node with the highest total power. This gives an initial comparison of the energy efficiency of the different power control schemes. For a more comprehensive analysis, an energy model would need to be applied to the allocated transmissions of the nodes and the base station.
We use a number of different performance metrics to evaluate our experimental results. For reference, these are summarized in Table VI.
Table VI: Performance metrics.

| Metric | Description |
| --- | --- |
| Frame size | The objective of our optimization functions: the number of coherence blocks required to satisfy all end devices’ traffic demands. Smaller frame sizes yield higher throughput. |
| Total power | The sum of the transmission power control coefficients for all end devices, for all coherence blocks in the frame. |
| Max node power | The sum of the transmission power control coefficients, for all coherence blocks in the frame, for the end device that had the highest such sum. |
| Solution time | The time, in seconds, required to solve each optimization problem. |
| Total solution time | The sum of the solution times of all optimization problems solved to reach the final solution. |
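The frame size and the two power metrics can be computed from a schedule as in the following sketch. The data layout is a hypothetical choice: a frame is a list of coherence blocks, and each block maps a device index to its uplink and downlink power control coefficients (0.0 when the device is inactive in that direction). The example values are made up.

```python
from collections import defaultdict


def power_metrics(frame):
    """Return (frame size, total power, max node power) for a schedule."""
    per_node = defaultdict(float)
    for block in frame:
        for device, (eta_up, eta_down) in block.items():
            per_node[device] += eta_up + eta_down
    total_power = sum(per_node.values())
    max_node_power = max(per_node.values(), default=0.0)
    return len(frame), total_power, max_node_power


frame = [
    {0: (1.0, 0.5), 1: (0.5, 0.0)},   # block 1: devices 0 and 1 active
    {0: (1.0, 0.0)},                  # block 2: device 0 transmits only
]
print(power_metrics(frame))
```

As noted above, these sums of coefficients only indicate relative energy usage; converting them to energy would require a separate energy model.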
VI-A1 Experiments 1–3
First, we will discuss the results from Experiments 1–3. All the results shown in the figures in this section are from Experiment 1; the results for Experiments 2 and 3 were similar and are omitted from further discussion.
The minimum frame sizes obtained were 21 coherence blocks for scenarios 1 and 2, and 35 blocks for scenarios 3–6. In some isolated cases, most often when using static power control, the minimum frame size was greater by one coherence block. This can be caused by more restrictive power control schemes being unable to accommodate the c-sets needed for the smaller frame size, although in one case this even occurred for optimal power control. This is a result of the different c-sets generated when solving the pricing problems. As discussed in Sections IV-B2 and V-C, the c-sets needed to optimally solve the main problem may differ from those needed to optimally solve its linear relaxation. In some cases, this can result in (slightly) suboptimal final solutions.
Based on the scenarios tested, smaller frame sizes result from situations where the device groups are well separated in terms of demand, that is, where the near and far nodes do not compete — all high demand nodes are at the same distance to the base station and thus have similar channels. Whether high demand nodes compete on the uplink or downlink does not appear to affect the frame size in these cases.
All power control schemes gave similar performance in terms of frame size, including downlink power control. This means that we can achieve similar throughput performance without uplink power control, if we schedule the nodes effectively. This can reduce signaling overhead as well as simplify the implementation of resource-constrained IoT end devices. Transmitting at full power on the uplink will of course consume more energy; however, this may be mitigated by efficient scheduling resulting in longer sleep times between transmissions. A full investigation of energy efficiency is, however, beyond the scope of this paper.
Nonetheless, the total power, shown in Figure 0(a), gives an indication of the comparative energy usage for the different configurations tested. As would be expected, the closer the devices are to the base station, the lower the power needed. Performing full optimal power control can result in significant energy savings; note that the power values shown in the figures for optimal power control are not zero, but rather very small relative to the other values. Fair power control gives the worst energy efficiency, even higher than static power control. This is because this power control scheme was designed to give the maximum fair SINR to all devices. Static power control, since it is fair power control performed over all devices, not just those in the current c-set, can only result in a lower or equal SINR, and uses less power for nodes with good channel quality. Both of these schemes in effect over-estimate the transmission power needed, as they are not designed with limited traffic demands in mind, but rather for saturated scenarios where all devices seek to maximize their throughput. The maximum node power results, not shown here, followed a similar pattern.
The number of pricing problem iterations in the CG algorithm required to generate all needed c-sets was less than 100 in most cases, with notable exceptions being fair, downlink, and optimal power control in scenarios 1 and 2, where more than 500 iterations were needed. Figure 0(b) shows the total solution time, broken down into the time needed for (all iterations of) the pricing problem, and the time needed for the main problem, i.e., the final MP/IP(). As would be expected, static power control takes the least amount of time, as here the power control coefficients are not decision variables to be optimized. Optimal power control also performs quite well, in most cases solving faster than fair power control. This is because for optimal power control, the power control constraints are linear, whereas fair power control requires additional integer variables. Scenarios 1 and 2, which had the smallest frames, required a much higher number of pricing problem iterations for all power control schemes except static power control.
In most cases, the main problem constitutes a large proportion of the solution time, but in a few cases the pricing problem instead takes significant time. However, the objective becomes close to its final, optimal, value after relatively few iterations: fewer than 20 iterations for the frame size to drop below 50 from initial values of between 200 and 400. This indicates that a good, albeit suboptimal, list of c-sets can be achieved by only performing a small number of iterations, which would dramatically reduce the time spent solving the pricing problem.
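One way to exploit this observation is a stopping rule that ends the CG loop once the master objective has flattened. The rule below is a hypothetical sketch and was not used to produce the results reported here; the window length, the tolerance, and the objective histories in the example are illustrative choices.

```python
def should_stop(history, window=5, rel_tol=0.01):
    """Stop when the master objective improved by at most rel_tol (relative)
    over the last `window` iterations.

    history holds the master objective (frame length) after each iteration.
    """
    if len(history) <= window:
        return False
    old, new = history[-window - 1], history[-1]
    return (old - new) <= rel_tol * old


# Early iterations: the frame length is still dropping quickly.
print(should_stop([400, 100, 50, 40], window=2))
# Later iterations: the objective has flattened.
print(should_stop([40, 36, 35.9, 35.9, 35.9], window=2))
```

The c-set list obtained this way is in general suboptimal for the linear relaxation, so the resulting frame size should be read as a heuristic value rather than a bound.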
VI-A2 Experiment 4
In Experiment 4, we varied the SINR threshold. Raising the threshold can equivalently be interpreted as worsening the channel quality of all the nodes, for example by moving them further away from the base station. This allows us to investigate how the frame size, power, and solution times are affected by solving the optimization problem under more challenging channel conditions. The results presented are from scenario 1; however, the overall behavior observed in the other scenarios as the SINR threshold was increased was similar.
Figure 1(a) shows the frame size vs. SINR threshold for all power control and precoding schemes tested. From the figure, we can see that the power control scheme makes little difference to the frame size over different SINR threshold values; however, the behavior for the two precoding schemes is very different. With maximum ratio combining, the frame size increases monotonically with increasing SINR threshold. We can observe regions where the frame size is held steady, before jumping up to its next value; this is because the SINR threshold reaches a critical point where one or more devices are no longer able to be accommodated in the same c-set, resulting in a different scheduling configuration with a larger frame.
However, zero forcing is able to maintain a steady frame size across all SINR threshold values, and even for higher threshold values tested in experiments not reported here. This is because the effectiveness of zero forcing, unlike MRC, does not depend on the relative quality of the devices’ channels, but rather on the accuracy of the channel estimation, which is not affected by increasing the SINR threshold. Zero forcing is thus able to maintain simultaneous transmission using the c-sets needed for the optimum frame size, essentially up until the point where the SINR threshold is so high that transmission is not possible to the devices at all. In terms of total and max node power, we again observed the same dichotomy between the two precoding schemes, with zero forcing maintaining steady performance across all SINR threshold values tested, while for MRC the power increases with increasing SINR threshold.
Figure 1(b) shows the total solution time as the SINR threshold is varied. Here we see a dramatic spike in solution times for intermediate SINR threshold values, especially for MRC with fair and optimal power control, but even with the other power control schemes. For ZF, the solution times are more consistent, but do vary with the power control scheme. The increase in solution time for MRC is likely due to the larger number of possible solutions in this region that need to be eliminated by the solver during branch-and-bound. In the intermediate SINR threshold region, there are many candidate c-sets where some or all nodes would achieve SINR close to the threshold value for some or all power coefficient values. Meanwhile, at high SINR thresholds, there are few c-sets that can satisfy the SINR constraints; in fact, at very high SINR thresholds, only singleton c-sets are possible, where only one device is served at a time. In the low SINR threshold region, most or all nodes are easily able to be accommodated in the same c-set. Increased solution times in the intermediate SINR region occurred for both the main and pricing problems. However, these longer solution times were not due to an increased number of iterations of the pricing problem. Rather, the individual iterations took longer to solve.
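The three threshold regions can be illustrated with a small enumeration. The sketch below is a toy under loudly stated assumptions: the function `toy_mrc_sinr` uses a simplified MRC-style expression (array gain divided by noise plus total received power, with full transmission power and no channel estimation error), which is not the SINR constraint of our formulations, and the channel gains, antenna count, and thresholds are made-up illustrative values.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def toy_mrc_sinr(gains: Sequence[float], k: int,
                 antennas: int = 100, rho: float = 1.0) -> float:
    """SINR of device k when all devices in `gains` transmit together.

    ASSUMPTION: simplified MRC-style model, for illustration only.
    """
    return antennas * rho * gains[k] / (1.0 + rho * sum(gains))


def is_feasible(gains: Sequence[float], threshold: float) -> bool:
    """A set of devices is a valid c-set if every member meets the threshold."""
    return all(toy_mrc_sinr(gains, k) >= threshold for k in range(len(gains)))


def feasible_csets(all_gains: Sequence[float],
                   threshold: float) -> List[Tuple[int, ...]]:
    """Enumerate every non-empty subset of devices that forms a valid c-set."""
    devices = range(len(all_gains))
    result = []
    for size in range(1, len(all_gains) + 1):
        for subset in combinations(devices, size):
            if is_feasible([all_gains[i] for i in subset], threshold):
                result.append(subset)
    return result


if __name__ == "__main__":
    gains = [1.0, 0.5, 0.2, 0.1]  # hypothetical large-scale channel gains
    for threshold in (1.0, 5.0, 40.0, 100.0):
        csets = feasible_csets(gains, threshold)
        largest = max((len(c) for c in csets), default=0)
        print(f"threshold={threshold:6.1f}  "
              f"feasible c-sets={len(csets):2d}  largest={largest}")
```

In this toy instance, all four devices fit in one c-set at the lowest threshold, the largest c-set shrinks to three devices and then to a single device as the threshold rises, and at the highest threshold no device can be served at all. Brute-force enumeration is used only because the example is tiny; it is not how the pricing problem is solved.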
These results indicate that, particularly for MRC, optimal solutions may be most suitable for cases where devices have either quite good or quite poor channel quality, whereas for intermediate cases, other approaches may be more suitable. Such approaches could include the variations on our solution algorithm discussed in Section V-C, or other approximation algorithms. The development of such algorithms will be the subject of our future work.
VI-A3 Experiment 5
In Experiment 5, we varied the number of devices. Again, the results presented here concern scenario 1, but similar behavior was observed for the other scenarios. Figure 3 shows the frame size as the number of devices increases. Below a certain point, more devices can be added without increasing the minimum frame size; beyond that point, however, the frame size increases with the number of devices. This is unsurprising, since with a fixed SINR threshold the viable c-sets and thus frame size are largely determined by the number of available pilots.
In terms of the power used, the total power increased steadily with the number of devices, while the max node power did not show any clear trend in relation to the number of devices. The total solution time increased exponentially with the number of devices, and this was the case for the time for both the pricing problem and the main problem. This behavior is typical of this kind of optimization problem. The number of pricing problem iterations also increased with the number of devices, although with more variation.
VI-A4 Experiment 6
In Experiment 6, since the near and far groups are unbalanced, with four times as many nodes in the far group as in the near group, the frame sizes (Figure 3(a)) were more varied between the different scenarios than in the other experiments. Here, the frame size was primarily determined by the total downlink demand, so that in scenarios where the larger (far) group had higher downlink demands, the frame size was also larger. This is because in this experiment, the far group had a very poor channel, making it more difficult for the base station to share its power across the different simultaneous devices in such a way as to achieve an acceptable SINR for all devices.
This is particularly evident when looking at the frame sizes when using static power control. With this power control scheme, the base station effectively allocates some of its transmission power to the nearby nodes, even when they are not scheduled to receive any data in a given coherence block. The remaining power allocated to the far nodes is then not enough to achieve a good SINR, preventing those nodes from being scheduled simultaneously, and resulting in much larger frame sizes. With power control optimization, however, the base station tailors its power allocation to the nodes that are actually active in each block, and is thus able to provide more power (and therefore a higher SINR) to the far nodes when they are scheduled on their own, without any close nodes active. The long frame sizes in these cases also resulted in higher total and max node power. For such a scenario, then, where the majority of nodes have poor channels, and especially when there are large downlink demands, transmission power control optimization becomes very important to ensure the efficient use of resources.
The solution times for Experiment 6 (Figure 3(b)) were mostly short, except for Scenario 1. In this case, the near group had higher demands than the far group. As discussed above, scheduling even one node from the near group at the same time as nodes from the far group significantly affects the transmission power allocation. This results in a more difficult scheduling problem when the majority of the traffic to be scheduled belongs to the near group.
VI-A5 Heuristic Methods
Table VII shows the performance of the different phases of our solution approach for Experiment 6, Scenario 1, as well as the performance when solving the main problem on a selected family of c-sets consisting of only those used in the optimal solution to the linear relaxation (rightmost column). From the table, we can see that the lower bound provided by the linear relaxation is very good, deviating by only two coherence blocks at most from the final IP solution of MP/IP(). Our proposed heuristic, using only c-sets in , also achieved objective values very close to those using the entire generated family , while substantially reducing the time required to solve the main problem. These times are short, both for the heuristic and the full MP/IP(), thanks to the high quality lower bound produced by the linear relaxation. However, as seen in the results from Experiment 5, the times will grow exponentially as the size of the network grows. As the solution times increase, a timeout can be used to attempt solution of the main problem for family , and fall back to using only selected c-sets, i.e., family , if the main problem cannot be solved in a reasonable time.
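The timeout-and-fallback strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions: `solve_main` is a hypothetical callable wrapping whatever solver is used for the main problem, assumed here to return the frame size, or `None` if the time limit expires before a solution is found; the names `full_family` and `selected_family` are likewise introduced only for this example.

```python
from typing import Callable, FrozenSet, Optional, Sequence, Tuple

CSet = FrozenSet[int]  # a c-set, represented here as a set of device indices

# Hypothetical solver interface: (family of c-sets, time limit in seconds
# or None for no limit) -> frame size, or None if the limit was hit.
MainSolver = Callable[[Sequence[CSet], Optional[float]], Optional[int]]


def solve_with_fallback(
    solve_main: MainSolver,
    full_family: Sequence[CSet],
    selected_family: Sequence[CSet],
    time_limit_s: float,
) -> Tuple[int, str]:
    """Try the full generated family first; fall back to the selected c-sets.

    `selected_family` is meant to hold only the c-sets used in the optimal
    solution of the linear relaxation, as in the heuristic described above.
    """
    frame_size = solve_main(full_family, time_limit_s)
    if frame_size is not None:
        return frame_size, "full family"
    # The full problem timed out: solve the smaller problem without a limit.
    frame_size = solve_main(selected_family, None)
    if frame_size is None:
        raise RuntimeError("main problem could not be solved on either family")
    return frame_size, "selected c-sets"
```

The returned label records which family produced the schedule, so that a fallback solution can be distinguished from one obtained on the full family.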
Somewhat surprisingly, most of the experiments we conducted did not give any substantial differences in frame size between either the different power control schemes, or the different precoding schemes. However, the other performance measures investigated, namely power and solution time, do vary greatly, which indicates there are benefits to choosing one scheme over another depending on the specific network configuration. In the case of a large group of nodes with a poor channel, optimizing transmission power did show substantial benefits in frame size, with static power control performing much worse than the other power control schemes. This is a fairly typical, realistic scenario, especially in situations such as urban environments where there is significant variation in channel gains resulting in a long tail distribution of channel gain among the devices. Our results here show a substantial improvement in throughput by jointly optimizing power control along with device scheduling.
Although a full energy model is needed to obtain concrete energy values, using optimal power control provided clear benefits in terms of energy savings. This is true for both the total power of the whole network, as well as the maximum individual power for any device node. Zero forcing with optimal power control consistently used the least power, two orders of magnitude lower than the others in most cases for Experiment 4, and three orders of magnitude in the case of Experiment 5. However, our experiments are insufficient to provide comprehensive guidance on which precoding scheme to use, since some important aspects of their performance, as well as intermediate schemes such as minimum mean-square error and regularized zero-forcing, are not considered here. For example, some research has shown that MRC outperforms ZF for multi-cell systems, depending on the number of devices and antennas, and the pilot reuse factor .
The fair power control schemes widely adopted in the literature on massive MIMO (see Section II) are targeted towards use cases with homogeneous devices and traffic, where the primary goal is high throughput. Our results, however, show clear benefits in tailoring power control for heterogeneous devices and traffic demands, especially where energy efficiency is of concern, as is usually the case for IoT scenarios, or where there are large discrepancies in channel qualities between devices. IoT devices are also often resource constrained, and our results show that power control on the uplink may be avoided by employing efficient scheduling without sacrificing throughput. This interaction between scheduling and power control is not taken into account in previous schemes, which instead treat the uplink and downlink similarly.
VII Future Work
There are many possible extensions to this work, both to further develop and validate our models, as well as to apply them to other problems and application scenarios. Here we have tested systematically constructed scenarios intended to illuminate how the performance of both the underlying massive MIMO system and our optimization formulations changes with different system parameters. An important step for future work is then to also test our formulations with real network traces and/or randomly generated network scenarios. The channel models could also be made more realistic, as those used here are relatively simple, with only path loss considered in our experiments, and only large scale effects taken into account in our model. In future work, channel models for multi-cell systems could also be used. This would significantly increase the complexity of the models, but would allow for performance of larger massive MIMO systems to be studied, as well as phenomena such as inter-cell cooperation and pilot contamination.
Energy usage is a critical performance measure for many IoT systems, and here we have not optimized for energy efficiency, but rather for overall system throughput. However, our model can also be applied to different objective functions, including minimal energy usage, fair energy usage, or delay minimization, which may be important for mission critical systems or tactile internet applications. In order to accommodate these new objectives, new versions of the main problem need to be formulated that model these performance measures, and from there the dual problems and thus pricing problem objectives can be derived. The pricing problem constraints we have formulated here, and which constitute the most difficult part of the formulations, will however still apply.
In our experiments we have investigated the solution times to find the optimal frame size. In some application scenarios, it would be feasible to use the optimal solutions, or a modified version of our solution algorithm where the time to solve is improved by either reducing the number of iterations, applying time limits, or using heuristic methods as discussed in Section V-C. This is especially true in the case of periodic traffic, where the optimization problem need only be solved once, and the solution can then be used for a long time. However, for cases where it is not feasible to apply our optimization formulations, there is a need to develop new, more efficient approximation algorithms.
In this paper we have developed a new model adapting the concept of compatible sets to massive MIMO systems, which allows for the efficient solution of a variety of types of optimization problems relating to network performance. We have applied our model to the case of joint device scheduling and power control for maximum throughput, considering two different precoding schemes and three power control schemes. Our results show substantial benefits in terms of energy usage when treating the power control coefficients as optimization variables, and large gains in throughput in jointly optimizing power control and device scheduling in scenarios with a large spread in channel qualities between devices. On the other hand, much simpler power control can also be applied without loss of throughput. In particular, the same throughput can be achieved without performing any power control on the uplink at all, which may reduce the complexity needed in resource-constrained IoT end devices, as well as reduce the signaling overhead between the base station and the devices.
With the advent of 5G driving the adoption of massive MIMO, there is a need for new modeling to analyze the performance of these systems, as well as provide practical solutions for implementation, for the broad range of different application scenarios and performance goals encompassed under the umbrella of the 5G requirements. In this work, we have targeted the case of a large number of devices with heterogeneous demands, and our models provide a flexible and general method for network optimization in such scenarios.
The work of Emma Fitzgerald and Fredrik Tufvesson was supported by the strategic research area ELLIIT. The work of Michał Pióro (and partly of Emma Fitzgerald) was supported by the National Science Centre, Poland, under the grant no. 2017/25/B/ST7/02313: “Packet routing and transmission scheduling optimization in multi-hop wireless networks with multicast traffic”.
In order to remove the variable multiplications in formulations (5), (6), (8), and (9), we need to introduce the following auxiliary variables and constraints. In formulation (5), we introduce auxiliary variables , for the uplink and downlink respectively. Constraints (5b) are then replaced with the following constraints:
Variable thus represents the product . For a valid power control scheme, should always be at most 1, however, in fair power control, this does not necessarily hold in cases where : a node can be assigned a power control coefficient greater than 1 when the node is not included in the c-set. This is because the coefficients are defined for all nodes (as all nodes can potentially be included in a c-set), but their values are calculated relative to the node in the c-set with the worst channel, that is, the lowest value of such that . This means that nodes which have lower values of than any of the nodes in the c-set will have .
While this would have no effect on practical power control, since nodes with do not transmit, it would cause some of the constraints in (10b) to be infeasible (no acceptable value can be found for ) without the final term in (10b), which simply cancels the lower bound on when . In such cases will be forced to zero.
Constraints (5c) are replaced with
Here, the extra term is not required as will always lie in the interval .
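As a point of reference, the standard linearization of the product of a binary variable and a bounded continuous variable can be written as below. The symbols are generic placeholders introduced only for this illustration and are not the notation of formulations (5) and (10): $x \in \{0, 1\}$ indicates whether a node is included in the c-set, $\eta \ge 0$ is its power control coefficient, $z$ is the auxiliary variable standing for the product $x\eta$, and $\eta^{\max}$ is an assumed upper bound on $\eta$.

```latex
\begin{align}
z &\le \eta^{\max} x, \\
z &\le \eta, \\
z &\ge \eta - \eta^{\max} (1 - x), \\
z &\ge 0.
\end{align}
```

With $x = 1$, the second and third constraints together force $z = \eta$. With $x = 0$, the first and last constraints force $z = 0$, and the third is inactive provided that $\eta \le \eta^{\max}$. If $\eta$ can exceed the assumed bound when $x = 0$, as can happen with fair power control, the third constraint can no longer be satisfied unless its relaxation term is large enough to cancel the lower bound, which corresponds to the role of the final term in (10b) described above.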