I. Introduction
Waterfilling solutions play a central role in the optimization of communication systems. They are undoubtedly among the most fundamental and important results in wireless communication design, signal processing design, and network optimization, including transceiver optimization, training optimization, resource allocation, and so on, e.g., [1, 2, 3, 5, 6, 7, 14, 13, 16, 8, 10, 11, 4, 12, 9, 17, 15]. Loosely speaking, optimal resource allocation for multidimensional communication systems usually leads to waterfilling solutions. Over the past decade, wireless systems have evolved dramatically and exhibited a great variety of configurations with many different performance requirements and physical constraints, e.g., [18]. This diversity has produced a rich body of variants of waterfilling solutions [23, 16, 13, 11, 14, 20, 27, 19, 12, 10, 9, 18, 25, 21, 26, 22, 24, 15, 17], from single-water-level ones to multiple-water-level ones [12], from solutions for perfect channel state information (CSI) to robust ones such as cluster waterfilling [25], and from constant-water-level ones to cavefilling ones [29, 27].
In most existing work, the first step in obtaining a waterfilling solution for an optimization problem is to find the Karush-Kuhn-Tucker (KKT) conditions and manipulate them into a recognizable format, which is usually referred to as the waterfilling solution. KKT conditions are necessary conditions for optimality, and are also sufficient if the problem is convex [2]. While the KKT conditions determine the optimal solutions, their initial formats are implicit and do not provide information on how to compute the optimal solution values. Thus, sophisticated mathematical manipulations are needed to transform them into a waterfilling structure.
As communication systems and optimization problems get more complicated, the corresponding KKT conditions also become more complicated, both in mathematical complexity and in the number of equations. Manipulating the KKT conditions into a recognizable format may become very difficult. First, the large number of complicated KKT conditions hinders efficient manipulation and a clear understanding of their physical meaning in terms of the waterfilling structure. Furthermore, the derived waterfilling solutions may not have a compact and systematic format that allows the development of waterfilling algorithms in an effective and unified manner.
Furthermore, the optimization design is not complete with the derived waterfilling solutions, as the solutions contain unknown parameters such as water levels. In other words, the solutions are still in implicit form. Thus, an important second step in obtaining the waterfilling solution of an optimization problem is to find a practical algorithm. This step has not been given sufficient attention and in general has been ignored. Generally speaking, waterfilling solutions consist of two major components, i.e., water level and water bottom, and the traditional imagery is that of pouring water over a pool with an uneven bottom [1]. Based on this analogy, several practical waterfilling algorithms have been proposed [32, 30, 33, 41, 36, 40, 31, 38, 39, 34, 28, 29, 35, 37]. They generally differ from each other in many respects, e.g., design logic, mathematical formulas, algorithm structures, computational complexity, and so on. This is because waterfilling algorithms are usually designed for a specific optimization problem. A unified waterfilling design framework was proposed in [32] based on an interesting geometric understanding of the waterfilling operation for throughput maximization under various constraints. That work opens a door to a unified understanding of how various kinds of power constraints affect throughput-maximizing waterfilling algorithm designs, instead of case-by-case discussions. This interesting and important work motivates us to investigate waterfilling algorithms from a unified viewpoint.
In this paper, we provide a new viewpoint on waterfilling solutions. It has three major advantages: 1) it helps the understanding of waterfilling results; 2) it avoids tedious and challenging manipulations of KKT conditions; and 3) it leads to efficient algorithms for finding the solution values. Based on this new understanding, a unified waterfilling algorithm design framework is proposed from an algebraic viewpoint, instead of the geometric viewpoint in [32]. Moreover, our work takes into account optimization problems with general objective functions and general power constraints. The main contributions are summarized as follows.

We provide a novel understanding from a dynamic perspective for optimization problems with waterfilling solutions. In contrast with the traditional approach, this viewpoint can avoid tedious manipulations of KKT conditions in deriving waterfilling solutions and greatly simplify waterfilling algorithm design.

A standard and plausible notation used in waterfilling solutions is the "+" operation, where $(x)^+ = \max(x, 0)$. Its widely acknowledged physical meaning is that the resource (e.g., power) allocated to a subchannel must be nonnegative. This physical meaning, while intuitive, should not be used as a conclusion in solution derivations or algorithm designs. In our work, index-based operations are introduced in the algorithm designs to avoid the "+" operation and simplify the algorithm design.

In addition to being efficient, the proposed method and the resulting algorithms are highly intuitive and understandable, and are also attractive from the implementation perspective. The method is also readily extended to complicated systems by using simple cases as building blocks.

With the proposed method, we investigate a class of convex communication optimization problems under box constraints, where the allocated resource of each subchannel is bounded from both ends. Corresponding algorithms for the optimal solution values are proposed. Moreover, the algorithms can be extended to serve even more general problems and have a wide range of applications.

Robust optimizations for wireless systems with CSI uncertainties are also studied. Algorithms for finding the solutions are again proposed for robust weighted mean-squared-error (MSE) minimization, robust capacity maximization, robust worst-MSE minimization, and robust minimum-capacity maximization for multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing (OFDM) systems, the last two of which were largely open problems.
The remainder of this paper is organized as follows. In Section II, our new viewpoint on waterfilling solutions is given, based on which an original algorithm is proposed to find the optimal solution. Following that, we investigate optimizations with box constraints in Section III. Then in Section IV, several extended optimizations are studied, including problems with ascending sum constraints, multiple water levels, and fairness considerations. Some numerical simulation results are given in Section V. Finally, conclusions are drawn in Section VI.
II. A New Viewpoint of Waterfilling Solutions
We consider a convex optimization problem of the following form:
(1) $\text{P1:} \quad \max_{\{p_n\}} \ \sum_{n=1}^{N} f_n(p_n) \quad \text{s.t.} \ \sum_{n=1}^{N} p_n = P, \quad p_n \ge 0, \ n = 1, \dots, N,$

where $P > 0$ is the total power and the functions $f_n(\cdot)$ are real-valued, increasing, and strictly concave. Further assume that the $f_n'(\cdot)$'s are continuous, where $f_n'$ denotes the first-order derivative of $f_n$. Many optimization problems in wireless communications have this format or contain this problem as an essential part, for example, the power allocation problem in MIMO capacity maximization. It is known that the optimal solution of P1 has a waterfilling structure. In what follows, we first explain the traditional treatment of this problem; then our new viewpoint and algorithm are elaborated, along with a comparison of the two approaches through examples.
II-A. Existing Treatment for Water-Filling Solutions
Traditionally, the Lagrange multiplier method has been used for P1. The first step is to find the KKT conditions and from them derive the waterfilling solution of the problem in a compact format. As the objective function is a sum of decoupled concave functions and the constraints are linear, the problem is convex. Thus the KKT conditions are both necessary and sufficient. With straightforward calculations, the KKT conditions of P1 are
(2) $f_n'(p_n) - \mu + \lambda_n = 0, \quad \lambda_n p_n = 0, \quad \lambda_n \ge 0, \quad p_n \ge 0, \ \forall n, \qquad \sum_{n=1}^{N} p_n = P,$

where $\mu$ and the $\lambda_n$'s are the Lagrange multipliers corresponding to the two constraint sets. By rewriting the KKT conditions, the solution is seen to have the following waterfilling structure:
(3) $p_n = \left( g_n(\mu) \right)^{+}, \qquad n = 1, \dots, N,$

where

(4) $g_n(\cdot) \triangleq \left( f_n' \right)^{-1}(\cdot),$

i.e., $g_n$ is the inverse function of $f_n'$. The Lagrange multiplier $\mu$ has the physical meaning of the water level. On the other hand, the role of the $\lambda_n$'s is implicit in this waterfilling solution, as they affect the solution only through the "+" operation. We would like to highlight that the "+" operation results from rigorous mathematical derivations. While it can be explained intuitively by "the power level must be nonnegative," the "+" operation should not be added recklessly during derivations merely because of this physical meaning. For more general problems, such practice can lead to mistakes in the solution.
Another important step in using waterfilling solutions in communication systems is to obtain the solution values, i.e., the values of the $p_n$'s in (3). This is a nontrivial step. The existing algorithms are usually designed for specific applications, and a unified framework is missing.
To obtain the values of the $p_n$'s from (3), a practical waterfilling algorithm is needed. The major challenge is to find the index set of active subchannels with nonzero powers, i.e.,

(5) $\mathcal{A} \triangleq \{ n \mid p_n > 0 \}.$

In general, all possible subchannel combinations need to be tried. For each possibility, the corresponding solution values can be found, and the one with the highest objective value among them is the optimal solution. But this method is obviously inefficient, as the complexity is exponential in $N$. For simple settings with simple functions and fortunate parameter values, a natural ordering of the subchannels exists, and the algorithm can be designed with lower complexity, where the number of possible active sets to be explored is of order $N$.
In [32], this class of optimization problems (P1) was studied for the case where the optimization variables are nonnegative integers. For integer variables, the derivatives in (4) do not apply and the waterfilling solution no longer exists.
II-B. New Viewpoint and Algorithm
The traditional method for P1, as explained in the previous subsection, has two major disadvantages. The first is the need for the transformation from KKT conditions to waterfilling solutions; as the problem becomes more general for more involved wireless systems and models, the transformation can become intractable. The second is the lack of general and effective algorithms for finding the values of the solution. In the following, from the perspective of a dynamic procedure, we give a new viewpoint on the solution of the optimization problem, which helps address both challenges.
Since $f_n$ is concave, $f_n'$ is a decreasing function, meaning that the increasing rate of $f_n$ decreases as $p_n$ increases. The optimization problem P1 aims at allocating the total power $P$ over a series of functions, i.e., the $f_n$'s. We can view this problem as dividing the available power into a large number of small portions and allocating the power portion by portion. For each portion, we should choose the subchannel whose $f_n$ has the maximum increasing rate, so as to maximize the sum of the $f_n$'s. As the increasing rate of this function decreases when a power portion is added to it, after receiving a certain amount of power, its increasing rate may fall below that of another subchannel. In this case, the new subchannel has the fastest increasing rate and the next power portion should be added to it. This procedure repeats until all power portions have been allocated. When the allocation stops, the functions that are allocated nonzero powers have the same increasing rate. Some subchannels may never get any power portion if their increasing rates are never the highest.
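This portion-by-portion intuition can be simulated directly. The sketch below is our own illustration, not part of the paper; the logarithmic objectives and gain values are hypothetical. It allocates the budget in small equal portions, always to the subchannel with the largest current derivative, and ends with the active subchannels sharing nearly the same increasing rate while the weak subchannel receives nothing:

```python
def greedy_portion_allocation(derivs, total_power, num_portions=60000):
    """Allocate total_power in equal small portions, each portion to the
    subchannel whose derivative (increasing rate) at its current power
    is largest."""
    n = len(derivs)
    p = [0.0] * n
    portion = total_power / num_portions
    for _ in range(num_portions):
        # pick the subchannel with the maximum current increasing rate
        best = max(range(n), key=lambda i: derivs[i](p[i]))
        p[best] += portion
    return p

# hypothetical subchannels with f_n(p) = log(1 + a_n p), so f_n'(p) = a_n/(1 + a_n p)
gains = [2.0, 1.0, 0.1]
derivs = [lambda p, a=a: a / (1.0 + a * p) for a in gains]
powers = greedy_portion_allocation(derivs, total_power=3.0)
rates = [d(x) for d, x in zip(derivs, powers)]
```

For these hypothetical parameters the result approaches the waterfilling allocation: the two strong subchannels end with (nearly) equal rates, and the weakest subchannel, whose rate is never the highest, gets zero power.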
The result discussed above is presented in the following lemma with a rigorous proof.
Lemma 1.
The following conditions are both necessary and sufficient for the optimal solution of P1:
(6)
$f_i'(p_i) = f_j'(p_j), \quad \forall\, i, j \in \mathcal{A},$
$f_k'(0) \le f_j'(p_j), \quad \forall\, k \notin \mathcal{A}, \ j \in \mathcal{A},$
$\sum_{n=1}^{N} p_n = P,$
where $\mathcal{A}$ is the active set defined in (5).
Proof.
We first prove the necessity part by contradiction. The necessity of the last line of (6) is obvious and has been proved in many existing works; its proof is thus omitted here. Denote the optimal solution of P1 as $\{p_n^\star\}$. Assume without loss of generality that $p_1^\star > 0$ and $p_2^\star > 0$ (i.e., $1, 2 \in \mathcal{A}$) but $f_1'(p_1^\star) > f_2'(p_2^\star)$. Since $f_1'$ and $f_2'$ are continuous, there exists an $\epsilon$ with $0 < \epsilon < p_2^\star$ such that $f_1'(p_1^\star + \delta) > f_2'(p_2^\star - \delta)$ for all $0 \le \delta \le \epsilon$. Thus

$f_1(p_1^\star + \epsilon) + f_2(p_2^\star - \epsilon) - f_1(p_1^\star) - f_2(p_2^\star) = \int_0^{\epsilon} \left[ f_1'(p_1^\star + \delta) - f_2'(p_2^\star - \delta) \right] \mathrm{d}\delta > 0.$

This shows that the new solution, which moves $\epsilon$ power from subchannel 2 to subchannel 1 and satisfies all constraints by construction, is better than $\{p_n^\star\}$, which contradicts the optimality assumption. This proves that the first line of (6) is necessary.
Similarly, to prove that the second line of (6) is necessary, assume without loss of generality that $p_1^\star = 0$ and $p_2^\star > 0$ (i.e., $1 \notin \mathcal{A}$ and $2 \in \mathcal{A}$) but $f_1'(0) > f_2'(p_2^\star)$. Since $f_1$ is strictly concave and $f_2'$ is continuous, there exists an $\epsilon$ with $0 < \epsilon < p_2^\star$ such that $f_1'(\delta) > f_2'(p_2^\star - \delta)$ for all $0 \le \delta \le \epsilon$. Thus

$f_1(\epsilon) + f_2(p_2^\star - \epsilon) - f_1(0) - f_2(p_2^\star) = \int_0^{\epsilon} \left[ f_1'(\delta) - f_2'(p_2^\star - \delta) \right] \mathrm{d}\delta > 0.$

This says that the new solution is better and thus leads to a contradiction.
For the sufficiency, it is enough to show that a solution satisfying (6) is a local maximum. Since P1 is a convex optimization, its local maximum is unique and is the global maximum. Let $\{p_n^\star\}$ be the solution satisfying (6) and consider a feasible solution $\{p_n\}$ in its vicinity. Define $\Delta_n \triangleq p_n - p_n^\star$ and let $\mu$ denote the common increasing rate of the active subchannels in (6). Notice that $\Delta_n \ge 0$ for $n \notin \mathcal{A}$, where $\mathcal{A} = \{ n \mid p_n^\star > 0 \}$. From the concavity of the $f_n$'s and the assumption that $\{p_n^\star\}$ satisfies (6), we have

(7) $f_n(p_n^\star + \Delta_n) - f_n(p_n^\star) \le f_n'(p_n^\star)\, \Delta_n \le \mu\, \Delta_n$

for all $n = 1, \dots, N$. Also, since $\sum_{n} p_n = \sum_{n} p_n^\star = P$, we have

(8) $\sum_{n=1}^{N} \Delta_n = 0.$

By combining (7) and (8), it can be concluded that $\sum_n f_n(p_n) \le \sum_n f_n(p_n^\star)$, and thus $\{p_n^\star\}$ is a local maximum.^1

^1 The lemma can also be proved by showing that (6) is equivalent to the KKT conditions, which are necessary and sufficient for P1. But here we use a direct proof to help illustrate the proposed new viewpoint and avoid unnecessary dependence on existing waterfilling results.
∎
From (6), we see that the value of $f_n'(p_n)$ for $n \in \mathcal{A}$, denoted as $\mu$, is the common increasing rate at the optimal power allocation. The allocated powers on the active subchannels can also be represented as functions of $\mu$:

(9) $p_n = g_n(\mu), \qquad n \in \mathcal{A},$

where $g_n$ is defined in (4). From the total power constraint,

(10) $\sum_{n \in \mathcal{A}} g_n(\mu) = P,$

based on which $\mu$ can be solved when the set of active subchannels $\mathcal{A}$ is known.
As explained in the previous subsection, the main difficulty in finding the solution values is to determine $\mathcal{A}$. We propose the use of index operations $s_n$'s to conquer this difficulty. When subchannel $n$ is allocated nonzero power, $s_n = 1$; otherwise $s_n = 0$. Based on these indices, (9) and (10) are rewritten as

(11) $p_n = s_n\, g_n(\mu), \quad n = 1, \dots, N, \qquad \sum_{n=1}^{N} s_n\, g_n(\mu) = P.$
With this result, we present a new waterfilling algorithm for P1 in Algorithm 1.
In the first step of Algorithm 1, all subchannels are initialized as active, and in the second step, the corresponding increasing rate $\mu$ and subchannel powers are calculated. As the computation of the $p_n$'s does not consider the constraints $p_n \ge 0$, it may happen that $p_n < 0$ for some $n$. In this case, the corresponding index $s_n$ is set to zero and this subchannel is allocated zero power in the next round. The procedure continues until all active subchannels are allocated nonnegative powers.
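Since the pseudocode of Algorithm 1 is not reproduced here, the following Python sketch is our own rendering of the procedure just described: all subchannels start active, the common rate $\mu$ is solved from the sum-power constraint over the active set (by bisection as a general fallback; closed forms exist for specific objectives), and subchannels with negative computed power are deactivated. The example $g_n$'s correspond to hypothetical log-capacity objectives.

```python
def solve_mu(g_active, budget, lo=1e-12, hi=1e12, iters=200):
    """Bisection for the rate mu such that the sum of g_n(mu) over the
    active set equals the budget; each g_n is decreasing in mu, so the
    sum is decreasing as well."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(g(mid) for g in g_active) > budget:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def algorithm1(g_list, total_power):
    """Iterative active-set waterfilling in the spirit of Algorithm 1."""
    active = list(range(len(g_list)))
    p = [0.0] * len(g_list)
    while True:
        mu = solve_mu([g_list[i] for i in active], total_power)
        trial = {i: g_list[i](mu) for i in active}
        negative = [i for i in active if trial[i] < 0.0]
        if not negative:                 # all active powers feasible: done
            for i in active:
                p[i] = trial[i]
            return p, mu
        active = [i for i in active if i not in negative]

# hypothetical g_n(mu) = 1/mu - 1/a_n, from f_n(p) = log(1 + a_n p)
gains = [2.0, 1.0, 0.1]
g_list = [lambda mu, a=a: 1.0 / mu - 1.0 / a for a in gains]
p, mu = algorithm1(g_list, total_power=3.0)
```

In this hypothetical run the weakest subchannel is deactivated after the first iteration, and the remaining two share the budget at a common rate $\mu$.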
Lemma 2.
Algorithm 1 converges and achieves the optimal solution of P1.
Proof.
In each iteration of Algorithm 1, the new active set is either the same as the previous one (and the algorithm terminates) or shrinks to a proper subset of the previous one. As the size of the initial set is $N$, it is obvious that the algorithm converges within $N$ iterations.
Now we prove that Algorithm 1 converges to the optimal solution of P1. First, since $P > 0$ and $\sum_n s_n g_n(\mu) = P$ at every iteration, it is impossible to have $g_n(\mu) \le 0$ for all active $n$; in other words, there always exists an $n$ such that $p_n > 0$. Let $\{p_n^{(t)}\}$ be the solution found by Algorithm 1 at the $t$th iteration. From Step 2 and Step 6, it is obvious that the solution satisfies the first and last conditions of (6). For any subchannel $m$ with $s_m = 0$ in the final solution, we had $p_m < 0$ in one of the previous iterations. Denote the corresponding iteration round as $t_m$. Thus from (11), $g_m(\mu^{(t_m)}) < 0$, from which $f_m'(0) < \mu^{(t_m)}$, where $\mu^{(t_m)}$ is the achieved increasing rate at the $t_m$th iteration. Notice that subchannel $m$ is in the active set of the $t_m$th iteration. With the proposed algorithm, subchannel $m$ is removed by setting $s_m = 0$, and in the next iteration, the sum power available to the remaining active subchannels decreases. The achieved increasing rate of the new iteration is therefore higher, i.e., $\mu^{(t_m + 1)} > \mu^{(t_m)}$. Denote the overall number of iterations as $T$. Since $\mu^{(T)} \ge \mu^{(t_m)} > f_m'(0)$ for every removed subchannel $m$, the solution found by the algorithm also satisfies the second condition of (6). As (6) is proved to be sufficient for the optimal solution in Lemma 1, the solution found by Algorithm 1 is the optimal one.∎
Remark: When the inverse functions $g_n$ in (4) and $\mu$ from (11) can be derived in closed form, the waterfilling solution and Algorithm 1 are in closed form. For each iteration of Algorithm 1, the complexity of calculating $\mu$ and the $p_n$'s is $O(N)$. Since there are at most $N$ iterations, the worst-case complexity of Algorithm 1 is $O(N^2)$. When the inverse functions or $\mu$ do not have closed forms, a numerical method such as bisection search or grid search is needed for an approximate solution. The complexity of Algorithm 1 then depends on the numerical algorithm and precision; with respect to $N$, the complexity order is still $O(N^2)$.
II-C. Comparison and Application Examples
The proposed new method, including the viewpoint and the algorithm, does not require manipulating the KKT conditions into the format of a waterfilling solution. Further, the proposed algorithm is general and has low complexity, with the worst-case number of iterations being $N$. On average, the number of iterations can be much smaller than $N$, since the proposed algorithm allows multiple subchannels to be made inactive in each iteration whenever their positivity constraints cannot be satisfied. For the traditional scheme, in general, all possible subsets of active subchannels need to be tested, whose number is exponential in $N$. For special cases where an ordering among the subchannels exists, the number of candidate sets can be reduced to the order of $N$, which still entails higher complexity than the proposed approach. In what follows, examples are provided to better elaborate the differences and advantages of the proposed method.
Example 1: A general weighted sum capacity maximization problem has the following form:

(12) $\max_{\{p_n\}} \ \sum_{n=1}^{N} w_n \log\left(1 + a_n p_n\right) \quad \text{s.t.} \ \sum_{n=1}^{N} p_n = P, \quad p_n \ge 0,$

where $P$, the weights $w_n$, and the channel gains $a_n$ are arbitrary nonnegative parameters.
With our proposed scheme, we first obtain from the objective function in (12)

(13) $f_n'(p_n) = \frac{w_n a_n}{1 + a_n p_n}.$

Then the solution values can be found by Algorithm 1 within $N$ iterations. Specifically, from (11),

(14) $p_n = s_n \left( \frac{w_n}{\mu} - \frac{1}{a_n} \right), \qquad \mu = \frac{\sum_{n=1}^{N} s_n w_n}{P + \sum_{n=1}^{N} s_n / a_n}.$

The calculations in Step 2 and Step 6 can be carried out straightforwardly using (13) and (14).
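For this weighted sum capacity example, each iteration of the algorithm is available in closed form. The sketch below is our own code, with `w` and `a` denoting the weights and channel gains under a hypothetical parameterization: the water level follows from the sum-power constraint over the current active set, and subchannels with negative tentative power are deactivated.

```python
def capacity_waterfilling(w, a, P):
    """Closed-form active-set iteration for
       max sum_n w[n]*log(1 + a[n]*p[n])  s.t.  sum p = P, p >= 0.
    With g_n(mu) = w[n]/mu - 1/a[n], the sum-power constraint over the
    active set gives mu = sum(w[n]) / (P + sum(1/a[n]))."""
    active = list(range(len(w)))
    while True:
        mu = sum(w[i] for i in active) / (P + sum(1.0 / a[i] for i in active))
        trial = {i: w[i] / mu - 1.0 / a[i] for i in active}
        negative = [i for i in active if trial[i] < 0.0]
        if not negative:
            p = [0.0] * len(w)
            for i in active:
                p[i] = trial[i]
            return p, mu
        active = [i for i in active if i not in negative]

p, mu = capacity_waterfilling(w=[1.0, 1.0, 1.0], a=[2.0, 1.0, 0.1], P=3.0)
```

No ordering of the products of weights and gains is needed; subchannels are simply dropped whenever their tentative powers turn negative.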
With the traditional scheme, via calculations, the following waterfilling solution is obtained:

(15) $p_n = \left( \frac{w_n}{\mu} - \frac{1}{a_n} \right)^{+}.$

Though in a compact, neat form, how to find the values of the optimal $p_n$'s from it is not self-explanatory: all possible active subchannel sets need to be tried to find the best one. Compared with the algorithm in [32], our algorithm is different, as it does not need to order the products of the weighting factors and channel gains.
Example 2: A general weighted MSE minimization problem can be written in the following form:

(16) $\min_{\{p_n\}} \ \sum_{n=1}^{N} \frac{w_n}{1 + a_n p_n} \quad \text{s.t.} \ \sum_{n=1}^{N} p_n = P, \quad p_n \ge 0,$

where $P$, the $w_n$'s, and the $a_n$'s are arbitrary nonnegative parameters.
With the proposed scheme, writing the problem as the maximization of $f_n(p_n) = -w_n / (1 + a_n p_n)$, we first obtain

(17) $f_n'(p_n) = \frac{w_n a_n}{\left(1 + a_n p_n\right)^2}.$

Similarly, Algorithm 1 can be used to find the solution values. Specifically, from (11),

(18) $p_n = \frac{s_n}{a_n} \left( \sqrt{\frac{w_n a_n}{\mu}} - 1 \right), \qquad \sqrt{\mu} = \frac{\sum_{n=1}^{N} s_n \sqrt{w_n / a_n}}{P + \sum_{n=1}^{N} s_n / a_n}.$

Then (17) and (18) can be used straightforwardly for the calculations in Steps 2 and 6.
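The per-iteration computations for this MSE example also admit closed forms. The sketch below is our own code with hypothetical parameters `w` and `a`: with $f_n(p) = -w_n/(1 + a_n p)$, the inverse derivative is $g_n(\mu) = (\sqrt{w_n a_n / \mu} - 1)/a_n$, and the sum-power constraint yields $\sqrt{\mu}$ directly.

```python
import math

def mse_waterfilling(w, a, P):
    """Closed-form active-set iteration for
       min sum_n w[n]/(1 + a[n]*p[n])  s.t.  sum p = P, p >= 0."""
    active = list(range(len(w)))
    while True:
        # sum-power constraint solved for sqrt(mu) over the active set
        sqrt_mu = (sum(math.sqrt(w[i] / a[i]) for i in active)
                   / (P + sum(1.0 / a[i] for i in active)))
        trial = {i: (math.sqrt(w[i] * a[i]) / sqrt_mu - 1.0) / a[i]
                 for i in active}
        negative = [i for i in active if trial[i] < 0.0]
        if not negative:
            p = [0.0] * len(w)
            for i in active:
                p[i] = trial[i]
            return p, sqrt_mu ** 2

        active = [i for i in active if i not in negative]

p, mu = mse_waterfilling(w=[1.0, 1.0], a=[4.0, 1.0], P=1.0)
```

Unlike an ordering-based scheme, this iteration does not require the weights and gains to be simultaneously ordered; infeasible subchannels are simply dropped.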
With the traditional scheme, via calculations, the following waterfilling solution is obtained as the first step:

(19) $p_n = \frac{1}{a_n} \left( \sqrt{\frac{w_n a_n}{\mu}} - 1 \right)^{+}.$

The same difficulty as in Example 1 appears here. Though (19) is in a compact, neat form, it is unclear how to find the values of the optimal solution from it. In general, all possible active subchannel sets need to be tried to find the best one, whose complexity is exponential in $N$. Ordering of the subchannels is only possible under stringent ordering conditions on the parameters, e.g., when the $w_n$'s and $a_n$'s can be ordered decreasingly simultaneously.
Example 3: The capacity maximization for dual-hop MIMO amplify-and-forward relaying networks can be cast as follows [42, 37]:
(20) 
where the parameters involved are nonnegative.
With our proposed scheme, we can obtain $f_n'$ from the objective function of the problem. Then the solution values can be found by Algorithm 1. But for this case, to find the value of $g_n(\mu)$ in Steps 2 and 6, a numerical bisection search is needed to solve the equation $f_n'(p_n) = \mu$.
With the traditional scheme, via some calculations, a waterfilling-form solution can be obtained as the first step. However, algorithms to find the waterfilling solution values were not explicitly provided in the existing literature.
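When $g_n$ has no closed form, Steps 2 and 6 invert the derivative numerically, as noted above. The generic pattern is sketched below in our own code; the concave stand-in $f'(p) = 1/(1+p)$ is illustrative only and is not the relaying objective from this example.

```python
def inverse_derivative(fprime, mu, p_hi=1e6, iters=100):
    """Solve fprime(p) = mu for p >= 0 by bisection; fprime must be
    continuous and decreasing. Returns 0.0 when fprime(0) <= mu,
    i.e., when the subchannel should receive no power."""
    if fprime(0.0) <= mu:
        return 0.0
    lo, hi = 0.0, p_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fprime(mid) > mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# illustrative decreasing derivative f'(p) = 1/(1 + p); f'(3) = 0.25
p = inverse_derivative(lambda q: 1.0 / (1.0 + q), 0.25)
```

The same routine can serve any objective in this framework whose derivative is continuous and decreasing, at a cost set by the bisection precision.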
Example 4: A weighted mutual information maximization problem for the training design can be written in the following format [38]:
(21) 
To use the proposed scheme, we first get $f_n'$ from the objective function
(22) 
Due to the complexity of $f_n'$, the inverse function $g_n$ does not have an explicit closed form. But Algorithm 1 can still be used to find the solution values by calculating $\mu$ and the $p_n$'s numerically in Steps 2 and 6 using (4), (11), and (22).
With the traditional scheme, the KKT conditions can be derived as the first step. However, unlike Examples 1-3, the KKT conditions for this example cannot be written in a compact waterfilling solution form using the "+" operation, and no efficient algorithm was available in the literature to find the solution values.
Example 5: A weighted MSE minimization problem for training optimization can be formulated as follows:
(23) 
To use the proposed scheme, we first get $f_n'$ from the objective function. Again, the inverse function $g_n$ does not have an explicit closed form, but Algorithm 1 can still be used to find the solution values by calculating $\mu$ and the $p_n$'s numerically in Steps 2 and 6.
With the traditional scheme, similar to Example 4, the KKT conditions can be obtained, but a compact waterfilling solution form has not been found with the "+" operation, nor have efficient algorithms been proposed in the literature to find the solution values.
II-D. Problems with Arbitrary Lower-Bound Constraints
In this subsection, we consider the extension of optimization problem P1 with arbitrary lower bounds on the subchannel powers:

(24) $\text{P1.1:} \quad \max_{\{p_n\}} \ \sum_{n=1}^{N} f_n(p_n) \quad \text{s.t.} \ \sum_{n=1}^{N} p_n = P, \quad p_n \ge q_n, \ n = 1, \dots, N,$

where $P \ge \sum_{n} q_n$, the lower bounds satisfy $q_n \ge 0$, and the $f_n$'s are real-valued, increasing, and strictly concave functions with continuous derivatives.
In P1.1, each subchannel power is limited by a nonnegative lower bound $q_n$, while in P1 the lower bounds are zero for all subchannels. For this more general case, define the active set as the set of subchannels whose powers are strictly higher than their lower bounds, i.e., $\mathcal{A} \triangleq \{ n \mid p_n > q_n \}$.
The following lemma is obtained.
Lemma 3.
The following conditions are both necessary and sufficient for the optimal solution of P1.1:
(25)
$f_i'(p_i) = f_j'(p_j), \quad \forall\, i, j \in \mathcal{A},$
$f_k'(q_k) \le f_j'(p_j), \quad \forall\, k \notin \mathcal{A}, \ j \in \mathcal{A},$
$\sum_{n=1}^{N} p_n = P.$
Proof.
The proof is very similar to that of Lemma 1, thus omitted. ∎
For the algorithm design, the index operation is introduced as follows: $s_n = 1$ when the power of subchannel $n$ is larger than its lower bound, i.e., $p_n > q_n$; otherwise $s_n = 0$. Let $\mu = f_n'(p_n)$ for $n \in \mathcal{A}$, which is the common increasing rate of the active subchannels. Via studies similar to those in Section II-B, the optimal solution of P1.1 can be represented as follows:

(26) $p_n = s_n\, g_n(\mu) + (1 - s_n)\, q_n, \quad n = 1, \dots, N, \qquad \sum_{n=1}^{N} \left[ s_n\, g_n(\mu) + (1 - s_n)\, q_n \right] = P.$

Notice that (11) is a special case of (26) with $q_n = 0$ for all $n$. Algorithm 2 is proposed to find the solution values for P1.1.
Lemma 4.
Algorithm 2 converges and achieves the optimal solution of P1.1.
Proof.
The proof is similar to that of Lemma 2, thus omitted. ∎
In each iteration of Algorithm 2, subchannels whose computed powers are less than their required lower bounds are removed from the iteration (i.e., put into the inactive set) and their powers are set to the corresponding lower bounds, i.e., $p_n = q_n$. Since these subchannels would otherwise be allocated smaller powers than their lower bounds, their increasing rates at the lower bounds are smaller than those of the other subchannels. After they are removed, less power is available for the remaining active subchannels. After the power allocation among the remaining subchannels in Step 6, the powers of the active subchannels decrease, and thus their increasing rates increase. Therefore, the removed subchannels cannot re-enter the competition for power in future iterations. This explains the convergence and optimality of the algorithm intuitively. The worst-case complexity order of Algorithm 2 is exactly the same as that of Algorithm 1, i.e., $O(N^2)$.
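A Python sketch of Algorithm 2 as described above (our own code; the paper's pseudocode is not reproduced here). Subchannels whose tentative power falls below the lower bound are pinned to that bound and removed, and $\mu$ is re-solved for the reduced budget by bisection. The example $g_n$'s and bounds are hypothetical.

```python
def solve_mu(g_active, budget, lo=1e-12, hi=1e12, iters=200):
    """Bisection: each g_n is decreasing in mu, so their sum is too."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(g(mid) for g in g_active) > budget:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def algorithm2(g_list, lower, total_power):
    """Waterfilling with per-subchannel lower bounds q_n (Algorithm 2 sketch)."""
    n = len(g_list)
    active = set(range(n))
    p = list(lower)                      # inactive subchannels sit at q_n
    while True:
        # budget left after pinning inactive subchannels to their bounds
        budget = total_power - sum(lower[i] for i in range(n)
                                   if i not in active)
        mu = solve_mu([g_list[i] for i in active], budget)
        trial = {i: g_list[i](mu) for i in active}
        violated = [i for i in active if trial[i] < lower[i]]
        if not violated:
            for i in active:
                p[i] = trial[i]
            return p, mu
        active -= set(violated)

# hypothetical g_n(mu) = 1/mu - 1/a_n; subchannel 2 has lower bound 2.0
g_list = [lambda mu: 1.0 / mu - 0.5, lambda mu: 1.0 / mu - 1.0]
p, mu = algorithm2(g_list, lower=[0.0, 2.0], total_power=3.0)
```

In this run the second subchannel would receive less than its lower bound, so it is pinned there and the remaining budget is waterfilled over the first subchannel.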
II-E. Discussions on More General Cases
The new viewpoint and method can be extended to solve more general optimization problems in wireless communications. Consider the following convex optimization problem:

(27) $\text{P2:} \quad \max_{\{p_n\}} \ \sum_{n=1}^{N} f_n(p_n) \quad \text{s.t.} \ \sum_{n=1}^{N} p_n = P, \quad h_n(p_n) \le 0, \ n = 1, \dots, N,$

where $P > 0$, the $f_n$'s are real-valued, increasing, and strictly concave functions with continuous derivatives, and the $h_n$'s are convex per-subchannel constraint functions.
The difference between P2 and the original problem P1 lies in the per-subchannel constraints $h_n(p_n) \le 0$. As P2 is convex, the following KKT conditions are necessary and sufficient for the optimal solution [2]:

(28) $f_n'(p_n) - \mu - \nu_n h_n'(p_n) = 0, \quad \nu_n h_n(p_n) = 0, \quad \nu_n \ge 0, \quad h_n(p_n) \le 0, \ \forall n, \qquad \sum_{n=1}^{N} p_n = P,$

where $\mu$ and the $\nu_n$'s are the Lagrange multipliers corresponding to the sum-power constraint and the per-subchannel constraints, respectively.
There are no unified methods or efficient algorithms in the literature to find the solution values of P2. Following the ideas proposed in the previous subsections, we can solve this challenging problem by considering two situations: 1) all constraints $h_n(p_n) \le 0$ are inactive (i.e., not satisfied with equality); and 2) at least one of them is active (i.e., satisfied with equality). The first situation leads to the same solution as P1. For the second, the results for P1 can be applied to the power allocation among the subchannels with inactive constraints, and the solutions for the subchannels with active constraints can be found by solving $h_n(p_n) = 0$. In the following sections, we will solve the generalized problem for several different cases.
Remark: The difference between our work and [32] can be summarized as the difference between geometric and algebraic viewpoints. Neither includes the other as a special case, and each has its own advantages and characteristics. Compared with the geometric logic, ours carries less geometric intuition. On the other hand, with the algebraic viewpoint, our method covers more mathematical formulations and aims to provide a unified treatment for a broad range of waterfilling solutions and waterfilling algorithms.
III. Problem with Box Constraints
In this section, we consider a special case of P2 in which the per-subchannel constraints are $p_n \ge q_n$ and $p_n \le u_n$. Equivalently, the optimization problem is as follows:

(29) $\text{P3:} \quad \max_{\{p_n\}} \ \sum_{n=1}^{N} f_n(p_n) \quad \text{s.t.} \ \sum_{n=1}^{N} p_n = P, \quad q_n \le p_n \le u_n, \ n = 1, \dots, N,$

where $0 \le q_n \le u_n$, $\sum_n q_n \le P \le \sum_n u_n$, and the $f_n$'s are real-valued, increasing, and strictly concave functions with continuous derivatives.
III-A. Two Algorithms Built on Finding Subchannel Sets
Similar to the previous section, we can view this problem as dividing the available power into infinitesimally small portions and allocating them portion by portion. At the start of the allocation, each subchannel $n$ must first receive $q_n$ to satisfy its lower-bound constraint. For each remaining portion, we should choose the subchannel whose $f_n$ has the maximum increasing rate, i.e., $f_n'(p_n)$, and whose power has not reached its upper bound, so as to maximize the sum of the $f_n$'s. As the increasing rate of $f_n$ decreases when a power portion is added, after adding a portion to the subchannel with the maximum increasing rate, say subchannel $m$, its increasing rate may fall below that of another subchannel. In this case, the new subchannel with the fastest increasing rate receives the next power portion. Otherwise, subchannel $m$ gets the next power portion if it still has the maximum rate. This procedure repeats until all power portions have been allocated. Some subchannels may never get any power beyond their lower bounds if their increasing rates are never the highest. Some subchannels may have the highest increasing rates but cannot get more power due to their upper-bound constraints. When the allocation stops, subchannels with no active bounds must have the same increasing rate.
For a given feasible solution $\{p_n\}$, denote

(30) $\mathcal{B}_l \triangleq \{ n \mid p_n = q_n \}, \qquad \mathcal{B}_u \triangleq \{ n \mid p_n = u_n \}, \qquad \mathcal{A} \triangleq \{ n \mid q_n < p_n < u_n \},$

which are the index sets of subchannels whose power values equal their lower bounds, equal their upper bounds, and lie in between the two bounds (i.e., the active subchannels), respectively. They are also the sets of subchannels with active lower bounds, active upper bounds, and no active bounds. The following lemma provides the necessary and sufficient conditions for the optimal solution of P3.
Lemma 5.
The following conditions are both necessary and sufficient for the optimal solution of P3:
(31)
$f_n'(p_n) = \mu, \quad \forall\, n \in \mathcal{A},$
$f_n'(q_n) \le \mu, \quad \forall\, n \in \mathcal{B}_l,$
$f_n'(u_n) \ge \mu, \quad \forall\, n \in \mathcal{B}_u,$
$\sum_{n=1}^{N} p_n = P,$
where $\mu$ denotes the common increasing rate of the subchannels in $\mathcal{A}$.
Proof.
The proof is similar to that of Lemma 1 with the following two changes: 1) the lower bounds change from 0 to ’s and 2) new upper bounds are added. Details are omitted to save space. ∎
The physical meaning of (31) is as follows. At the optimal solution, the subchannels with inactive bounds have the same increasing rate $f_n'(p_n)$, denoted as $\mu$. Subchannels with active lower bounds have increasing rates lower than $\mu$, and subchannels with active upper bounds have increasing rates higher than $\mu$.
Based on the viewpoint and the conditions for the optimal solution of P3, we propose Algorithm 3 to find the solution values by using Algorithm 2 as a building block. The idea is to first consider the lower-bound constraints only and use Algorithm 2 to find the corresponding solution. Then the subchannels whose power values are larger than or equal to their upper bounds are reset to their upper bounds and removed from the next iteration. In the next iteration, power is allocated among the remaining subchannels using Algorithm 2 again. The process continues until the powers of all remaining subchannels are smaller than their upper bounds. Algorithm 3 has one more level of iteration than Algorithm 2; thus its worst-case complexity order is one order of $N$ higher than that of Algorithm 2.
Algorithm 3 does not treat the lower-bound and upper-bound constraints symmetrically. While subchannels that reach or violate their upper-bound constraints are removed during the iterations, those reaching or violating their lower-bound constraints stay in the 'while' loop and participate in the power allocation procedure with Algorithm 2. Another algorithm, symmetric to Algorithm 3, can be designed by switching the roles of the lower- and upper-bound constraints.
Next, we consider both constraints jointly. Based on the aforementioned discussions, the key task is to determine the sets $\mathcal{B}_l$, $\mathcal{B}_u$, and $\mathcal{A}$ defined in (30). We introduce two sets of indices, $s_n$'s and $t_n$'s, as follows: $s_n = 1$ if $p_n > q_n$ and $s_n = 0$ otherwise, and $t_n = 1$ if $p_n < u_n$ and $t_n = 0$ otherwise, where $p_n$ is the power allocated to subchannel $n$. Thus $s_n$ indicates whether the power of subchannel $n$ is larger than its lower bound, and $t_n$ indicates whether it is smaller than its upper bound. For the index tuple $(s_n, t_n)$, $(1, 1)$ means the subchannel is an active one and neither constraint is tight; $(0, 1)$ means the subchannel belongs to $\mathcal{B}_l$; and $(1, 0)$ means the subchannel belongs to $\mathcal{B}_u$. Similar to the previous section, let $\mu$ be the common increasing rate of the active subchannels, and we have the following necessary conditions for P3 from (31):

(32) $p_n = s_n t_n\, g_n(\mu) + (1 - s_n)\, q_n + (1 - t_n)\, u_n, \quad n = 1, \dots, N, \qquad \sum_{n=1}^{N} p_n = P.$
Algorithm 4 is proposed following the idea of Algorithm 2, with extensions for both lower- and upper-bound constraints. The worst-case complexity order of Algorithm 4 is the same as that of Algorithm 2, i.e., $O(N^2)$.
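As an alternative sketch to the set-tracking iterations of Algorithms 3 and 4 (whose pseudocode is not reproduced here), the box-constrained conditions can also be solved by searching over the common rate $\mu$ directly: the clipped power profile $\min(u_n, \max(q_n, g_n(\mu)))$ is non-increasing in $\mu$, so $\mu$ can be bisected until the powers sum to $P$. This is our own illustration and assumes a feasible budget, $\sum_n q_n \le P \le \sum_n u_n$.

```python
def box_waterfilling(g_list, lower, upper, total_power, iters=200):
    """Bisection on mu for box constraints q_n <= p_n <= u_n; each g_n
    must be decreasing in mu. Assumes sum(lower) <= P <= sum(upper)."""
    def powers(mu):
        # clipping realizes the three cases of the optimality conditions
        return [min(u, max(q, g(mu)))
                for g, q, u in zip(g_list, lower, upper)]
    lo, hi = 1e-12, 1e12
    for _ in range(iters):
        mid = (lo * hi) ** 0.5           # bisect on a log scale
        if sum(powers(mid)) > total_power:
            lo = mid
        else:
            hi = mid
    mu = (lo * hi) ** 0.5
    return powers(mu), mu

# hypothetical g_n(mu) = 1/mu - 1/a_n; subchannel 1 is capped at 1.0
g_list = [lambda mu: 1.0 / mu - 0.5, lambda mu: 1.0 / mu - 1.0]
p, mu = box_waterfilling(g_list, lower=[0.0, 0.0], upper=[1.0, 10.0],
                         total_power=3.0)
```

In this run the first subchannel is clipped at its cap and the surplus power flows to the second subchannel, whose rate alone sets the final water level.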
By using the results in Lemma 5 and following the proof of Lemma 4, the convergence and optimality of Algorithms 3 and 4 can be proved.
Proof.
The detailed proof is similar to that of Lemma 4 and is thus omitted to save space. ∎