# Impulsive Control for G-AIMD Dynamics with Relaxed and Hard Constraints

Motivated by various applications from Internet congestion control to power control in smart grids and electric vehicle charging, we study Generalized Additive Increase Multiplicative Decrease (G-AIMD) dynamics under impulsive control in continuous time with the time average alpha-fairness criterion. We first show that the control under relaxed constraints can be described by a threshold. Then, we propose a Whittle-type index heuristic for the hard constraint problem. We prove that in the homogeneous case the index policy is asymptotically optimal when the number of users is large.


## I Introduction

For nearly two decades the Additive Increase Multiplicative Decrease (AIMD) mechanism was one of the main components of the TCP/IP protocol suite regulating data traffic across the Internet [25]. In the absence of significant queueing delay, AIMD increases the data sending rate linearly in time until a packet loss occurs and then drastically reduces the sending rate in a multiplicative fashion. However, in the most recent versions of TCP (Compound [26] in Windows and Cubic [12] in Linux), the linear growth function has been replaced by non-linear functions to enable agile adaptation of the data sending rate. Such modifications can be viewed as particular cases of Non-linear AIMD (NAIMD) dynamics. The space of possible non-linear modifications of AIMD is very large. A thorough classification of NAIMD dynamics, together with the analysis of some NAIMD classes, can be found in the book [9]. Here we consider one important class of NAIMD dynamics, which we refer to as Generalized AIMD (G-AIMD) [6]. In the G-AIMD dynamics the acceleration of the sending rate in the increase phase depends on the current value of the rate.

Another important recent development in the Internet architecture is the introduction of Software-Defined Networking (SDN) technology [18]. SDN allows much finer control of resource allocation (e.g., bandwidth allocation) in a network. Motivated by this opportunity, in the present work we study the control of G-AIMD dynamics. In the networking context, it is very common to use a fairness function as the optimization objective when allocating resources. In the foundational work [15], the authors proposed to use proportional fairness in the context of the network utility maximization problem. Then, in [22] the $\alpha$-fairness function was proposed, which generalizes proportional fairness and gives max-min fairness and delay fairness as other important particular cases. A very good review of the network utility maximization problem can be found in [24]. Most works on the resource allocation problem are concerned with long-term fairness, which ignores instantaneous oscillations of the sending rate, that is, short-term fairness. Short-term fairness is particularly important in wireless and electrical networks. Following [2], in this work we optimize the integral of the $\alpha$-fairness function over time, which represents short-term fairness.

We would like to note that AIMD and, more generally, NAIMD have recently found new applications in smart electrical grids [10, 13, 17] and in power control for charging electric vehicles [9, 11, 23]. We hope that our findings will also be useful in these application domains.

Let us describe our contributions specifically. In the next section we formulate the problem of short-term $\alpha$-fairness for resource allocation among G-AIMD users as an impulsive control problem under constraints with a time average criterion. We would like to note that our impulsive control setting is different from the standard one [21], where there is a constraint on the number of impulses or on the total variation of the impulse control. In our case, we have only a constraint on the system state. The present work also represents an advance with respect to our previous work [6], where we did not consider the setting with constraints. Here we consider both hard and relaxed constraints. In Section III we show that in the case of the relaxed constraints, the optimal impulsive control of the G-AIMD dynamics can be given in threshold form. Then, in Section IV we propose a heuristic which is similar in spirit to the celebrated Whittle index [28]. We would like to note that several attempts to prove indexability of AIMD [8, 14] and G-AIMD [5] dynamics have been made in the past. However, to the best of our knowledge, this is the first time that the indexability of the G-AIMD dynamics is proved without any artificial conditions. We were able to make this theoretical advance largely thanks to the framework of impulsive control in continuous time. The previous works on TCP indexability are all in discrete time, and some are also in a discrete state space, whereas [5] is in a continuous state space. Similarly to [27], we are able to show that in the homogeneous case the index policy is asymptotically optimal in the regime of a large number of users. As a by-product, we prove the global stability of the AIMD dynamics and the local asymptotic stability of the G-AIMD dynamics under the index policy in the homogeneous setting. This extends the work [3] on the reduce max rate policy, where only the existence and uniqueness of a fixed point were shown but the stability in the deterministic setting was not investigated. We conclude the paper in Section V with future research directions.

## II Model and problem formulation

Let us consider Generalized Additive Increase Multiplicative Decrease (G-AIMD) dynamics with $N$ users in continuous time. In the absence of a control signal, the allocation $x_k$ to user $k$ (e.g., transmission rate in Internet congestion control or instantaneous power in charging stations for electric vehicles) increases according to the differential equation:

$$\frac{dx_k}{dt} = a_k x_k^{\gamma_k}, \tag{1}$$

with $a_k > 0$ and $0 \le \gamma_k \le 1$. Continuous-time models represent well the TCP sending rate evolution on the scale of several round-trip times [7, 29].

We consider impulsive control. Namely, when the control signal (impulse) is sent to user $k$ at time $t$, the resource allocation to user $k$ drastically decreases according to

$$x_k(t+0) = b_k x_k(t), \tag{2}$$

with $0 < b_k < 1$. We note that the above dynamics is fairly general and covers at least three important particular cases: if $\gamma_k = 0$ we retrieve the classical Additive Increase Multiplicative Decrease (AIMD) mechanism [25], $\gamma_k = 0.75$ corresponds to Compound TCP [26] when queueing delays are not large, and $\gamma_k = 1$ corresponds to the Multiplicative Increase Multiplicative Decrease (MIMD) mechanism or Scalable TCP [16]. MIMD is a very aggressive dynamics [1] and, in contrast, AIMD is much more gentle. Compound TCP is designed to represent a good balance between the two extremes.
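The dynamics (1)-(2) can be integrated in closed form between impulses, which makes it easy to experiment with the particular cases listed above. The following Python sketch is only an illustration: the function names `grow`, `time_to_reach`, `impulse` and `sawtooth`, as well as the parameter values in the usage note, are assumptions made for this example and do not come from the original text.

```python
import math

def grow(x, a, gamma, dt):
    """Exact solution of dx/dt = a * x**gamma after time dt, with no impulse."""
    if gamma == 1.0:
        return x * math.exp(a * dt)  # MIMD: exponential growth
    e = 1.0 - gamma
    return (x ** e + e * a * dt) ** (1.0 / e)

def time_to_reach(x, target, a, gamma):
    """Time needed to grow from x to target (target >= x) without impulses."""
    if gamma == 1.0:
        return math.log(target / x) / a
    e = 1.0 - gamma
    return (target ** e - x ** e) / (e * a)

def impulse(x, b):
    """Multiplicative decrease (2): x(t+0) = b * x(t), with 0 < b < 1."""
    return b * x

def sawtooth(x0, level, a, gamma, b, cycles):
    """Impulse epochs when the rate is cut each time it reaches `level`.

    `level` is a hypothetical cut-off used only for illustration;
    x0 is assumed to satisfy x0 <= level.
    """
    t, x, epochs = 0.0, x0, []
    for _ in range(cycles):
        t += time_to_reach(x, level, a, gamma)
        epochs.append(t)
        x = impulse(level, b)
    return epochs
```

For instance, with the illustrative values `a = 1`, `gamma = 0` (AIMD), `b = 0.5`, starting from rate 1 and cutting at level 2, `sawtooth(1.0, 2.0, 1.0, 0.0, 0.5, 3)` places the impulses one time unit apart, since the rate needs one time unit to climb linearly from 1 back to 2.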

Let us define formally a class of policies slightly larger than the class of purely deterministic policies. The need for such a class of policies will be clear from the subsequent development.

###### Definition 1
• Let be fixed. For user a policy is a sequence, say

of time moments, when an impulse (a multiplicative decrease in his sending rate) is applied. Here

is a monotone nondecreasing sequence of constants in . It is possible that multiple impulses are applied at a single time moment, but we require .

• Let and be two policies for user . Then, for each we denote by

a mixture of the two policies, which with probability

chooses the sequence and with the complementary probability chooses the sequence . For user , we denote by (resp., ) the set of policies (resp. all such mixed policies for all ).

• We introduce the notation and

Let be fixed. Each policy defines the dynamics of (stochastic, if ), and the corresponding expectation is denoted by .

Let us denote by $x(t) = (x_1(t), \dots, x_N(t))$ the vector of resource allocations at time $t$. Ideally, at each time moment we aim to operate the system under the constraint:

$$\sum_{k=1}^{N} x_k(t) \le c, \quad \forall t, \tag{3}$$

where $c$ is the resource capacity (e.g., transmission capacity or electric power). It appears that if we substitute the above hard constraint with a soft time-averaged constraint, the problem becomes more tractable. Namely, consider

$$\sum_{k=1}^{N} \limsup_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} x_k(t)\,dt\right] \le c.$$

Our first objective is to propose an impulsive control in closed form, which solves the following constrained problem:

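$$J(u) := \sum_{k=1}^{N} \liminf_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} \frac{x_k^{1-\alpha}(t)}{1-\alpha}\,dt\right] \longrightarrow \sup_{u\in U}, \quad \text{subject to: } \sum_{k=1}^{N} \limsup_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} x_k(t)\,dt\right] \le c. \tag{4}$$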

Here the initial state is arbitrarily fixed and will not be indicated, and the generic notation $u = (u_1, \dots, u_N)$ is in use. For each $k$, the $k$-th term of $J(u)$ is the short-term $\alpha$-fairness [2]. The short-term $\alpha$-fairness is a versatile fairness concept, which retrieves as particular cases: proportional fairness ($\alpha \to 1$), delay-based fairness ($\alpha = 2$) and max-min fairness ($\alpha \to \infty$).

In order to deal with the control problem under constraints, we use the multiobjective optimization approach. To this end, let us define the two competing objectives:

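$$J(u) = \sum_{k=1}^{N} J_k(u_k) := \sum_{k=1}^{N} \liminf_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} \frac{x_k^{1-\alpha}(t)}{1-\alpha}\,dt\right],$$

$$G(u) = \sum_{k=1}^{N} G_k(u_k) := \sum_{k=1}^{N} \limsup_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} x_k(t)\,dt\right].$$

It appears that it is more convenient to consider $-J$ instead of $J$, which leads to the standard multiobjective problem:

$$-J(u) \to \inf_{u\in U}, \qquad G(u) \to \inf_{u\in U}.$$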

Throughout this paper, we assume the following

###### Assumption 1

for each

The particular cases excluded by the assumption can be separately analyzed using similar techniques. We exclude such cases for the sake of presentation smoothness and because of space limitation.

## III Control in the relaxed case

Let us formally justify the reduction of the problem with the relaxed constraint to the multiobjective formulation and demonstrate how the original solution can be reconstructed.

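To scalarize the multiobjective problem, we introduce the variable weight $\lambda$ and consider the combined criterion

$$L(\lambda,u) = \sum_{k=1}^{N} L_k(\lambda,u_k) := \sum_{k=1}^{N} \bigl(-J_k(u_k) + \lambda G_k(u_k)\bigr) \to \inf_{u\in U}.$$

Note that the above problem reduces to $N$ subproblems: for each $k = 1, \dots, N$,

$$-J_k(u_k) + \lambda G_k(u_k) \to \inf_{u_k\in U_k}. \tag{5}$$
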
###### Lemma 1

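For each $k$ and $\lambda > 0$, an optimal policy for problem (5) is of threshold type with the threshold given by

$$\bar{x}_k(\lambda) = \left\{ \frac{(2-\gamma_k)\left(1-b_k^{2-\alpha-\gamma_k}\right)}{\left(1-b_k^{2-\gamma_k}\right)(2-\alpha-\gamma_k)\,\lambda} \right\}^{1/\alpha}. \tag{6}$$

In greater detail, under this threshold policy, user $k$ decreases the sending rate at time $t$ as soon as $x_k(t) \ge \bar{x}_k(\lambda)$. (It is clear that this threshold policy induces a policy in $U'_k$.)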

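Proof. Let $k$ and $\lambda$ be fixed. As was shown in [6] (see Theorem 3.1 there), the policy, say $u_k^*$, defined by the threshold given by (6) is optimal for the following problem

$$\limsup_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} \left(-\frac{x_k^{1-\alpha}(t)}{1-\alpha} + \lambda x_k(t)\right) dt\right] \to \inf_{u_k \in U'_k}. \tag{7}$$

It is clear that this policy is also optimal for the same problem posed over all the policies in $U_k$, i.e.,

$$\limsup_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} \left(-\frac{x_k^{1-\alpha}(t)}{1-\alpha} + \lambda x_k(t)\right) dt\right] \to \inf_{u_k \in U_k}.$$

(In fact, if it is outperformed by a mixed policy, then there must be another deterministic policy outperforming this threshold policy, which contradicts the optimality of the threshold policy within $U'_k$.) Then

$$-J_k(u_k^*) + \lambda G_k(u_k^*) = \lim_{\tau\to\infty} \frac{1}{\tau} E^{u_k^*}\left[\int_0^{\tau} \left(-\frac{x_k^{1-\alpha}(t)}{1-\alpha} + \lambda x_k(t)\right) dt\right] \le \limsup_{\tau\to\infty} \frac{1}{\tau} E^{u_k}\left[\int_0^{\tau} \left(-\frac{x_k^{1-\alpha}(t)}{1-\alpha} + \lambda x_k(t)\right) dt\right] \le -J_k(u_k) + \lambda G_k(u_k)$$

for each $u_k \in U_k$.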
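As a numerical sanity check of the threshold form, one can compare the threshold (6) with a brute-force search over candidate thresholds. The sketch below is an illustration under stated assumptions: it only searches within the family of threshold policies, it evaluates the time-average cost of a threshold policy as an average over one cycle (using $dt = dx/(a_k x^{\gamma_k})$, so that $a_k$ cancels), and the names `threshold`, `cycle_average`, `cost` and `best_on_grid` are hypothetical helpers, not taken from [6].

```python
def threshold(lam, alpha, gamma, b):
    """Threshold (6) for a user with parameters gamma, b and weight lam."""
    num = (2 - gamma) * (1 - b ** (2 - alpha - gamma))
    den = (1 - b ** (2 - gamma)) * (2 - alpha - gamma) * lam
    return (num / den) ** (1 / alpha)

def cycle_average(f, theta, gamma, b, n=1000):
    """Time average of f(x(t)) over one cycle from b*theta up to theta.

    Midpoint rule in x with weight x**(-gamma), since dt = dx / (a * x**gamma);
    the constant a cancels in the ratio.
    """
    lo, hi = b * theta, theta
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        w = x ** (-gamma)
        num += f(x) * w
        den += w
    return num / den

def cost(theta, lam, alpha, gamma, b):
    """Per-unit-time value of -x^(1-alpha)/(1-alpha) + lam*x under threshold theta."""
    return cycle_average(lambda x: -x ** (1 - alpha) / (1 - alpha) + lam * x,
                         theta, gamma, b)

def best_on_grid(lam, alpha, gamma, b, points=101):
    """Grid search over thresholds between 0.5x and 1.5x the value of (6)."""
    star = threshold(lam, alpha, gamma, b)
    grid = [star * (0.5 + i / (points - 1)) for i in range(points)]
    best = min(grid, key=lambda th: cost(th, lam, alpha, gamma, b))
    return best, star
```

Calling `best_on_grid(1.0, 0.5, 0.0, 0.5)` returns the grid minimizer together with the value of (6); in this illustrative setting the two should agree up to the grid resolution.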

###### Lemma 2

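For each $k$ and $\lambda > 0$, under the threshold policy given by (6), if $\gamma_k \neq 1$,

$$-J_k^*(\lambda) := -J_k(u_k^*) = -[\bar{x}_k(\lambda)]^{1-\alpha}\, \frac{\left(1-b_k^{2-\alpha-\gamma_k}\right)(1-\gamma_k)}{(1-\alpha)(2-\alpha-\gamma_k)\left(1-b_k^{1-\gamma_k}\right)},$$

$$G_k^*(\lambda) := G_k(u_k^*) = \bar{x}_k(\lambda)\, \frac{\left(1-b_k^{2-\gamma_k}\right)(1-\gamma_k)}{(2-\gamma_k)\left(1-b_k^{1-\gamma_k}\right)},$$

$$L_k^*(\lambda) := -J_k(u_k^*) + \lambda G_k(u_k^*) = -\bar{x}_k(\lambda)\,\lambda\, \frac{\alpha}{1-\alpha}\, \frac{\left(1-b_k^{2-\gamma_k}\right)(1-\gamma_k)}{(2-\gamma_k)\left(1-b_k^{1-\gamma_k}\right)};$$

and if $\gamma_k = 1$,

$$-J_k^*(\lambda) = [\bar{x}_k(\lambda)]^{1-\alpha}\, \frac{1-b_k^{1-\alpha}}{(1-\alpha)^2 \ln b_k}, \qquad G_k^*(\lambda) = \bar{x}_k(\lambda)\, \frac{b_k-1}{\ln b_k}, \qquad L_k^*(\lambda) = -\bar{x}_k(\lambda)\,\lambda\, \frac{\alpha}{1-\alpha}\, \frac{b_k-1}{\ln b_k}.$$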

Proof. The details needed for the derivation of the above objectives can be found in [6].
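The closed-form expressions of Lemma 2 can be cross-checked against a direct numerical average over one sawtooth cycle. The following sketch does this for a given threshold value; the helper names `G_star`, `J_star` and `cycle_average` and the test parameters are assumptions chosen for illustration, and the quadrature is a plain midpoint rule rather than anything used in [6].

```python
import math

def G_star(theta, gamma, b):
    """Time-average rate under the threshold theta (Lemma 2)."""
    if gamma == 1.0:
        return theta * (b - 1) / math.log(b)
    return theta * (1 - b ** (2 - gamma)) * (1 - gamma) / (
        (2 - gamma) * (1 - b ** (1 - gamma)))

def J_star(theta, alpha, gamma, b):
    """Time-average alpha-fairness under the threshold theta (Lemma 2)."""
    if gamma == 1.0:
        return -theta ** (1 - alpha) * (1 - b ** (1 - alpha)) / (
            (1 - alpha) ** 2 * math.log(b))
    return theta ** (1 - alpha) * (1 - b ** (2 - alpha - gamma)) * (1 - gamma) / (
        (1 - alpha) * (2 - alpha - gamma) * (1 - b ** (1 - gamma)))

def cycle_average(f, theta, gamma, b, n=4000):
    """Midpoint-rule time average of f(x(t)) over one cycle [b*theta, theta].

    Uses dt = dx / (a * x**gamma); the growth constant a cancels out.
    """
    lo, hi = b * theta, theta
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        w = x ** (-gamma)
        num += f(x) * w
        den += w
    return num / den
```

For example, for `theta = 2`, `gamma = 0.75`, `b = 0.5` and `alpha = 2`, the value `G_star(2.0, 0.75, 0.5)` should match `cycle_average(lambda x: x, 2.0, 0.75, 0.5)` up to the quadrature error.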

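Let us now investigate the trade-off of $-J_k^*$ against $G_k^*$. We consider two cases, (a) $\alpha < 1$ and (b) $\alpha > 1$, separately. The following two observations hold for each $k$:

• $\alpha < 1$: by equation (6), if $\lambda \to 0$ then $\bar{x}_k(\lambda) \to \infty$ and consequently $G_k^*(\lambda) \to \infty$; at the same time, $[\bar{x}_k(\lambda)]^{1-\alpha} \to \infty$ and consequently $-J_k^*(\lambda) \to -\infty$. Now if $\lambda \to \infty$ then $\bar{x}_k(\lambda) \to 0$ and $G_k^*(\lambda) \to 0$; and at the same time $[\bar{x}_k(\lambda)]^{1-\alpha} \to 0$ and consequently $-J_k^*(\lambda) \to 0$.

• $\alpha > 1$: Again by equation (6), if $\lambda \to 0$ then $\bar{x}_k(\lambda) \to \infty$ and consequently $G_k^*(\lambda) \to \infty$. However, in this case $[\bar{x}_k(\lambda)]^{1-\alpha} \to 0$ and consequently $-J_k^*(\lambda) \to 0$. Now if $\lambda \to \infty$ then $\bar{x}_k(\lambda) \to 0$ and $G_k^*(\lambda) \to 0$; and at the same time $[\bar{x}_k(\lambda)]^{1-\alpha} \to \infty$ and consequently $-J_k^*(\lambda) \to +\infty$.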

Next we establish the convexity of the epigraph.

###### Lemma 3

For each $k$, $G_k^*$, legitimately regarded as a function of $-J_k^*$, is convex. Moreover, its epigraph coincides with the convex hull of its graph.

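Proof. To prove the convexity, it will be more convenient to consider the parametrization with respect to $\bar{x}_k$. We note that since there is a one-to-one correspondence between $\lambda$ and $\bar{x}_k$, the two parametrizations are equivalent. Observe that

$$-J_k^*(\bar{x}_k) = -c_1 \frac{\bar{x}_k^{1-\alpha}}{1-\alpha}, \quad c_1 > 0, \qquad G_k^*(\bar{x}_k) = c_2 \bar{x}_k, \quad c_2 > 0,$$

where the constants come from the expressions of Lemma 2 for $\gamma_k \neq 1$ (resp., for $\gamma_k = 1$). Thus, we can write

$$G_k^*(-J_k^*) = c_2 \left[\frac{(-J_k^*)(1-\alpha)}{-c_1}\right]^{\frac{1}{1-\alpha}}.$$

Hence,

$$\frac{dG_k^*}{d(-J_k^*)} = -\frac{c_2}{c_1} \left[\frac{(-J_k^*)(1-\alpha)}{-c_1}\right]^{\frac{\alpha}{1-\alpha}},$$

and

$$\frac{d^2 G_k^*}{d(-J_k^*)^2} = \frac{\alpha c_2}{c_1^2} \left[\frac{(-J_k^*)(1-\alpha)}{-c_1}\right]^{\frac{2\alpha-1}{1-\alpha}} > 0,$$

since $\frac{(-J_k^*)(1-\alpha)}{-c_1} = \bar{x}_k^{1-\alpha} > 0$ always.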

The last assertion follows from the two observations before this lemma: there is no asymptote if $\alpha < 1$, and the same conclusion holds if $\alpha > 1$, too. Examples of the epigraph in the two cases are displayed in Figures 1.(a) and 1.(b). This completes the proof.

###### Remark 1

For each $k$ denote by $\Omega_k$ the convex hull of the graph (or equivalently the epigraph, according to the previous lemma) of $G_k^*$ as a function of $-J_k^*$. It can be seen that $(-J_k(u_k), G_k(u_k)) \in \Omega_k$ for each $u_k \in U_k$. Indeed, if there is some $u_k \in U_k$ such that $(-J_k(u_k), G_k(u_k)) \notin \Omega_k$, it can only lie below the graph of $G_k^*$ against $-J_k^*$, but then for some $\lambda$ it contradicts the fact that the threshold policy given by $\bar{x}_k(\lambda)$ is optimal for problem (5) with the same $\lambda$. This observation is important for the argument below.

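Now we consider problem (4), and reformulate it in the space of performance vectors. That is, we reformulate

$$\begin{cases} -J(u) = \sum_{k=1}^{N} \bigl(-J_k(u_k)\bigr) \to \inf_{u\in U}, \\ G(u) - c = \sum_{k=1}^{N} G_k(u_k) - c \le 0, \end{cases}$$

as

$$\begin{cases} -\tilde{J}(\omega) \to \inf_{\omega\in\Omega}, \\ \tilde{G}(\omega) - c \le 0, \end{cases} \tag{10}$$

where

$$-\tilde{J}(\omega) := \sum_{k=1}^{N} \omega_k^1, \qquad \tilde{G}(\omega) := \sum_{k=1}^{N} \omega_k^2,$$

and where

$$\omega = \{(\omega_k^1, \omega_k^2)\}_{k=1}^{N} \in \Omega := \prod_{k=1}^{N} \Omega_k \subset \mathbb{R}^{2N}.$$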

In fact, these two problems are equivalent because of the following. For each there exists some such that and , and conversely for each there exists some satisfying and ; recall Remark 1. However, the correspondence may be not one-to-one.

We shall effectively solve problem (10), whose optimal solution then induces one for problem (4).

We are now in a position to state the main result.

###### Theorem 1

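The following assertions hold.

• (a) The set $\Omega$ is convex in $\mathbb{R}^{2N}$, and the functions $-\tilde{J}$ and $\tilde{G}$ on $\Omega$ are convex and real-valued.

• (b) There exists some $\hat{\omega} \in \Omega$ such that $\tilde{G}(\hat{\omega}) - c < 0$, i.e., Slater’s condition for problem (10) is satisfied.

• (c) The threshold policy $u^* = (u_1^*, \dots, u_N^*)$ is optimal for problem (4), where for each $k$, $u_k^*$ is induced by the threshold $\bar{x}_k(\lambda^*)$, with

$$\lambda^* = \frac{1}{c^{\alpha}} \left( \sum_{k:\,\gamma_k \neq 1} \frac{(1-\gamma_k)}{(2-\gamma_k)} \frac{\left(1-b_k^{2-\gamma_k}\right)}{\left(1-b_k^{1-\gamma_k}\right)} \left(\frac{2-\gamma_k}{1-b_k^{2-\gamma_k}}\right)^{1/\alpha} \left(\frac{1-b_k^{2-\alpha-\gamma_k}}{2-\alpha-\gamma_k}\right)^{1/\alpha} + \sum_{k:\,\gamma_k = 1} \frac{\left(\frac{1-b_k^{1-\alpha}}{1-\alpha}\right)^{1/\alpha}}{(1-b_k)^{(1-\alpha)/\alpha}(-\ln b_k)} \right)^{\alpha}. \tag{11}$$

In the homogeneous case ($\gamma_k = \gamma$ and $b_k = b$ for all $k$) the expression becomes even simpler:

$$\lambda^* = \frac{N^{\alpha}}{c^{\alpha}}\, \frac{(1-\gamma)^{\alpha}\left(1-b^{2-\gamma}\right)^{\alpha}}{\left(1-b^{1-\gamma}\right)^{\alpha}(2-\gamma)^{\alpha}} \times \frac{(2-\gamma)\left(1-b^{2-\alpha-\gamma}\right)}{\left(1-b^{2-\gamma}\right)(2-\alpha-\gamma)},$$

and, consequently,

$$\bar{x}_k(\lambda^*) = \frac{c}{N}\, \frac{1-b^{1-\gamma}}{1-\gamma}\, \frac{2-\gamma}{1-b^{2-\gamma}}, \tag{12}$$

in case $\gamma \neq 1$, and

$$\lambda^* = \frac{1}{c^{\alpha}} \left( \frac{N \left(1-b^{1-\alpha}\right)^{1/\alpha}(b-1)}{\bigl((1-b)(1-\alpha)\bigr)^{1/\alpha} \ln b} \right)^{\alpha}$$

in case $\gamma = 1$.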
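To make the multiplier formula concrete, the sketch below computes $\lambda^*$ for a small heterogeneous population and checks that the resulting thresholds saturate the relaxed constraint, i.e., that the time-average rates sum to $c$. The helper names `C`, `D`, `lambda_star` and `thresholds` and the example parameters are assumptions introduced for illustration; `C` and `D` simply package the per-user constants appearing in Lemma 2 and in (6).

```python
import math

def C(gamma, b):
    """Ratio G*_k / threshold from Lemma 2 (time-average rate per unit threshold)."""
    if gamma == 1.0:
        return (b - 1) / math.log(b)
    return (1 - gamma) * (1 - b ** (2 - gamma)) / (
        (2 - gamma) * (1 - b ** (1 - gamma)))

def D(alpha, gamma, b):
    """Constant in (6): threshold**alpha = D / lambda."""
    return (2 - gamma) * (1 - b ** (2 - alpha - gamma)) / (
        (1 - b ** (2 - gamma)) * (2 - alpha - gamma))

def lambda_star(users, alpha, c):
    """Multiplier of Theorem 1; `users` is a list of (gamma_k, b_k) pairs."""
    s = sum(C(g, b) * D(alpha, g, b) ** (1 / alpha) for g, b in users)
    return (s / c) ** alpha

def thresholds(users, alpha, c):
    """Per-user thresholds (6) evaluated at lambda_star."""
    lam = lambda_star(users, alpha, c)
    return [(D(alpha, g, b) / lam) ** (1 / alpha) for g, b in users]
```

For instance, with the illustrative users `[(0.0, 0.5), (0.75, 0.5), (1.0, 0.875)]`, `alpha = 0.5` and `c = 10`, summing `C(g, b) * x` over the users and their thresholds `x` should give back the capacity 10, and in a homogeneous population the thresholds should coincide with (12).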

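Proof. Part (a) is evident. For part (b), note that one can take $\hat{\omega} \in \Omega$ such that

$$\forall k \in \{1, \dots, N\}, \quad \hat{\omega}_k^2 < \frac{c}{N}.$$

This is possible because $G_k^*(\lambda)$ approaches zero when $\lambda \to \infty$. Thus, Slater’s condition is satisfied.

The rest of this proof verifies part (c). For each $\lambda > 0$ let $\omega^*(\lambda)$ be generated by the threshold policy determined by the thresholds $\bar{x}_k(\lambda)$. We solve

$$\tilde{G}(\omega^*(\lambda)) = c \tag{13}$$

for $\lambda$, which yields $\lambda^*$ given by (11). Then it holds that

$$-\tilde{J}(\omega^*(\lambda^*)) + \lambda^*\bigl(\tilde{G}(\omega^*(\lambda^*)) - c\bigr) \le -\tilde{J}(\omega) + \lambda^*\bigl(\tilde{G}(\omega) - c\bigr), \quad \forall\, \omega \in \Omega, \tag{14}$$

by Lemma 1. According to Theorem 1 of Section 8.4 in [19], this shows that $\omega^*(\lambda^*)$ solves problem (10). Part (c) immediately follows.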

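Consider $\lambda^*$ given by (11). According to (14) and the fact that (13) is satisfied by $\lambda^*$, we see that

$$\mu_0 = \inf_{\omega\in\Omega} \bigl\{-\tilde{J}(\omega) + \lambda^*\bigl(\tilde{G}(\omega) - c\bigr)\bigr\},$$

where

$$\mu_0 = \inf\, \bigl(-\tilde{J}(\omega)\bigr), \quad \text{subject to } \omega \in \Omega,\ \tilde{G}(\omega) \le c.$$

Any constant $\lambda' \ge 0$ satisfying the above equality with $\lambda^*$ being replaced by $\lambda'$ is sometimes called a geometric multiplier for problem (10), see Definition 6.1.1 of [4]. The following result from [19, Thm. 1 in Sect. 8.3] (see the proof therein) is used to show that $\lambda^*$ is the unique geometric multiplier for problem (10).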

###### Proposition 1

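Let $\Omega$ be a convex set. Let $f$ be a real-valued convex function on $\Omega$ and $G$ be a real-valued convex function on $\Omega$. Assume the existence of a point $\omega_1 \in \Omega$ for which $G(\omega_1) < 0$. Let

$$\mu_0 = \inf f(\omega), \quad \text{subject to } \omega \in \Omega,\ G(\omega) \le 0, \tag{15}$$

and assume $\mu_0$ is finite. Then there is a number $\lambda' \ge 0$ such that

$$\mu_0 = \inf_{\omega\in\Omega} \{f(\omega) + \lambda' G(\omega)\}, \tag{16}$$

and thus a geometric multiplier exists. Furthermore, for each geometric multiplier $\lambda'$, if the infimum is achieved in (15) by an $\omega^* \in \Omega$, $G(\omega^*) \le 0$, it is achieved by $\omega^*$ in (16) and

$$\lambda' G(\omega^*) = 0.$$
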
###### Corollary 1

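$\lambda^*$ given by (11) is the unique geometric multiplier for problem (10).

Proof. Suppose $\lambda'$ is a geometric multiplier for problem (10). Let $\omega^*(\lambda)$ be as in the proof of Theorem 1. Let us verify that $\lambda' > 0$. Suppose for contradiction that $\lambda' = 0$. Consider the case of $\alpha < 1$. Remember, $\mu_0$ is finite. However, since $\lambda' = 0$ is a geometric multiplier,

$$\forall k \in \{1, \dots, N\}, \quad \inf_{\omega_k\in\Omega_k} [\omega_k^1] = -\infty,$$

and

$$\mu_0 = \inf_{\omega\in\Omega} [-\tilde{J}(\omega)] = -\infty,$$

which leads to a contradiction. Consider the case of $\alpha > 1$. Then $\mu_0$ is strictly positive. However,

$$\forall k \in \{1, \dots, N\}, \quad \inf_{\omega_k\in\Omega_k} [\omega_k^1] = 0,$$

and since $\lambda' = 0$ is a geometric multiplier,

$$\mu_0 = \inf_{\omega\in\Omega} [-\tilde{J}(\omega)] = 0,$$

which is again a contradiction. Thus, $\lambda' > 0$. According to Proposition 1, $\lambda'$ satisfies $\tilde{G}(\omega^*(\lambda')) = c$, which admits the unique solution $\lambda' = \lambda^*$.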

The above results suggest a number of distributed control algorithms. First, let us suppose that the numbers of users of each type are known to all users or broadcast to the users by a central authority (e.g., an SDN controller). Then, each user can calculate its threshold by (6) and (11) and can control its rate by reducing it when the threshold is reached. Thus, except for the complete initial knowledge of the system’s parameters, no further exchange of information is required.

Another interesting case is when each user knows its individual parameters but not the parameters of the other users. In this case, the central controller can calculate the Lagrange multiplier by equation (11) and distribute it to the users.

## IV Index policy for hard constraint

Let $\lambda_k(x_k)$ denote the index of user $k$, i.e., the value of $\lambda$ for which $x_k$ equals the threshold (6). Since $\lambda_k(x_k)$ is a monotone decreasing function of $x_k$, the comparison of $\lambda_k(x_k(t))$ with $\lambda^*$ provides the optimal solution for the relaxed problem formulation. What is more, the fact that $\lambda_k(x_k)$ is a monotone decreasing function implies indexability of the problem with hard constraint [28].

Then, we can propose the following heuristic for the case of hard constraint [28]: whenever the hard constraint (3) is met, the user with the minimal value of $\lambda_k(x_k(t))$ reduces its rate. Let us call the resulting policy the Whittle-type index policy or briefly the index policy.

It is intriguing to observe that the expression for $\lambda_k(x_k)$ contains neither the parameters of the other users nor the number of users. Therefore, the Whittle index type approach may be very useful in the adaptive scenario when the number of users changes with time.
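A single decision step of such an index policy can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the index is assumed here to be the value of $\lambda$ obtained by inverting the threshold relation (6) at the current rate, and the names `index` and `index_policy_step`, the tolerance `tol`, and the example values are all hypothetical.

```python
def index(x, alpha, gamma, b):
    """Assumed index: the lambda for which x is exactly the threshold (6)."""
    num = (2 - gamma) * (1 - b ** (2 - alpha - gamma))
    den = (1 - b ** (2 - gamma)) * (2 - alpha - gamma)
    return num / (den * x ** alpha)

def index_policy_step(rates, params, alpha, c, tol=1e-12):
    """Apply one impulse if the hard constraint (3) is met.

    rates  : current allocations x_k
    params : list of (gamma_k, b_k) pairs, one per user
    Returns (k, new_rates), with k = None if the capacity is not reached.
    """
    if sum(rates) < c - tol:
        return None, list(rates)
    # The user with the minimal index value reduces its rate.
    k = min(range(len(rates)),
            key=lambda i: index(rates[i], alpha, *params[i]))
    new_rates = list(rates)
    new_rates[k] = params[k][1] * rates[k]  # multiplicative decrease (2)
    return k, new_rates
```

With identical users, e.g. `params = [(0.0, 0.5)] * 3`, rates `[3.0, 5.0, 2.0]`, `alpha = 0.5` and `c = 10`, the step selects the user with the largest rate, since the assumed index is decreasing in the rate. Note that each user's index depends only on its own parameters and current rate.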

From now on, in this section, we consider the homogeneous case, i.e., we suppose and for each . This is the standard first step in the analysis of index policies [27]. As previously, Assumption 1 is supposed to hold without explicit references. It is without loss of generality to assume .

Let $u^{ind}$ be the index policy. Let $u^*$ be the threshold policy obtained in Theorem 1, which is optimal for problem (4). Note that the index policy satisfies the hard capacity constraint (3). Therefore, denoting by $U_H$ the class of policies satisfying the hard capacity constraint (3), one has

$$J(u^{ind}, x, c, N) \le \sup_{u\in U_H} J(u, x, c, N) \le J(u^*, x, c, N),$$

for each initial state $x$, capacity constraint $c$, and number of users $N$, which we indicate explicitly in this section for the following reason. Our objective is to show that the index policy is asymptotically optimal in the following sense:

$$\lim_{N\to\infty} \frac{1}{N} J(u^{ind}, x, cN, N) = \lim_{N\to\infty} \frac{1}{N} J(u^*, x, cN, N). \tag{17}$$

In the important case of $\gamma = 0$ corresponding to the AIMD dynamics, we show that the index policy is asymptotically optimal for each initial state, and in the case of $\gamma \neq 0$, we show that it is asymptotically optimal for the initial states close enough to the steady state.

### IV-A The AIMD (γ=0) case

Throughout this subsection we consider the homogeneous AIMD case, i.e., $\gamma_k = 0$ for each $k$.

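We first observe that since the index is monotone and decreasing in the sending rate, in the homogeneous case the index policy is equivalent to the policy that reduces the maximal sending rate at the moment when the hard constraint is met. Let us now consider, under the index policy, the sequence of the sending rates observed at each time when the capacity constraint is met. Following [3], for each $\tilde{x} = (\tilde{x}_1, \dots, \tilde{x}_N)$ such that

$$\tilde{x}_1 \ge \tilde{x}_2 \ge \dots \ge \tilde{x}_N > 0; \qquad \sum_{i=1}^{N} \tilde{x}_i = c, \tag{18}$$

we introduce

$$g(\tilde{x}) := (g_1(\tilde{x}), \dots, g_N(\tilde{x}))$$

defined in the following way. If

$$\tilde{x}_k \ge b\tilde{x}_1 > \tilde{x}_{k+1} \tag{19}$$

for some $k \in \{1, \dots, N\}$, with the convention $\tilde{x}_{N+1} := 0$, then

$$\begin{cases} g_1(\tilde{x}) := \tilde{x}_2 + b_1\tilde{x}_1, \\ \quad\vdots \\ g_{k-1}(\tilde{x}) := \tilde{x}_k + b_1\tilde{x}_1, \\ g_k(\tilde{x}) := b\tilde{x}_1 + b_1\tilde{x}_1, \\ g_{k+1}(\tilde{x}) := \tilde{x}_{k+1} + b_1\tilde{x}_1, \\ \quad\vdots \\ g_N(\tilde{x}) := \tilde{x}_N + b_1\tilde{x}_1, \end{cases}$$

where the last two lines are not relevant if $k = N$, and $b_1 := (1-b)/N$ is a constant. Note that if $\Delta(\tilde{x})$ denotes the time duration from the reduction of $\tilde{x}_1$ according to the index policy until the next time when the hard capacity constraint is met, then

$$b_1\tilde{x}_1 = \frac{(1-b)\tilde{x}_1}{N} = a\Delta(\tilde{x}). \tag{20}$$

The interpretation of $g(\tilde{x})$ is the vector of the sending rates, ordered from the largest to the smallest one, at the next time the hard capacity constraint is met (before the reduction), starting from $\tilde{x}$. Put $\tilde{x}^{(0)} := \tilde{x}$, where $\tilde{x}$ is a fixed vector satisfying (18), and

$$\tilde{x}^{(m)} := g(\tilde{x}^{(m-1)}) =: g^{(m)}(\tilde{x}), \quad m \ge 1;$$

we are interested in $\tilde{x}^{(m)}$ as $m \to \infty$. Let us introduce, for each vector $\tilde{x}$ satisfying (18), $k(\tilde{x})$ as the integer satisfying (19).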

###### Theorem 2

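In the homogeneous AIMD case ($\gamma = 0$), the mapping $g$ has a unique fixed point, say $\tilde{x}^*$, in the space of vectors satisfying (18), given by

$$\tilde{x}_n^* = \left(b + (N-n+1)\frac{(1-b)}{N}\right) \frac{c}{Nb + \frac{(N+1)(1-b)}{2}}, \tag{21}$$

$n = 1, \dots, N$, and $\tilde{x}^{(m)} \to \tilde{x}^*$ as $m \to \infty$.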

Proof. Firstly, note that there exists some integer $m$ such that $k(\tilde{x}^{(m)}) = N$, for otherwise the sending rate of some user would have blown up to $\infty$, violating the hard capacity constraint.

Next, observe that if $k(\tilde{x}^{(m)}) = N$ for some $m$, then $k(\tilde{x}^{(m+1)}) = N$ as well. Indeed, if this was not the case, then we would have

$$b\tilde{x}_1^{(m+1)} = b\bigl(\tilde{x}_2^{(m)} + b_1\tilde{x}_1^{(m)}\bigr) > b\tilde{x}_1^{(m)} + b_1\tilde{x}_1^{(m)}$$

and thus

$$0 \ge b\bigl(\tilde{x}_2^{(m)} - \tilde{x}_1^{(m)}\bigr) > b_1\bigl(\tilde{x}_1^{(m)} - b\tilde{x}_1^{(m)}\bigr) > 0,$$

which is the desired contradiction. Therefore, for some $m$ and all subsequent steps, the maximal sending rate (before reduction) when the hard capacity constraint is met will become the minimal sending rate (just after reduction).

Consequently, for all large enough $m$ we have $\tilde{x}^{(m+1)} = g(\tilde{x}^{(m)}) = A\tilde{x}^{(m)}$, where

$$A = \begin{pmatrix} b_1 & 1 & 0 & \dots & 0 \\ b_1 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_1 & 0 & 0 & \dots & 1 \\ b_1 + b & 0 & 0 & \dots & 0 \end{pmatrix}.$$

Since the matrix $A$ is an aperiodic irreducible (column) stochastic matrix, we conclude that $\tilde{x}^{(m)}$ converges to the unique fixed point of $g$ in the space of vectors satisfying (18).

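Let us compute the fixed point by solving the following system:

$$\begin{cases} \tilde{x}_1^* = \tilde{x}_2^* + b_1\tilde{x}_1^*, \\ \tilde{x}_2^* = \tilde{x}_3^* + b_1\tilde{x}_1^*, \\ \quad\vdots \\ \tilde{x}_N^* = b\tilde{x}_1^* + b_1\tilde{x}_1^*, \\ \sum_{i=1}^{N} \tilde{x}_i^* = c, \end{cases}$$

which gives

$$\begin{cases} \tilde{x}_n^* = (b + (N-n+1)b_1)\,\tilde{x}_1^*, \quad 1 \le n \le N, \\ \sum_{i=1}^{N} \tilde{x}_i^* = c. \end{cases}$$

Therefore,

$$\tilde{x}_1^* = \frac{c}{\sum_{i=1}^{N}(b + ib_1)} = \frac{c}{Nb + \frac{(N+1)(1-b)}{2}}, \qquad \tilde{x}_n^* = \left(b + (N-n+1)\frac{(1-b)}{N}\right)\frac{c}{Nb + \frac{(N+1)(1-b)}{2}}, \quad 2 \le n \le N,$$

see (21).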
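The convergence statement can be illustrated numerically by iterating the map and comparing the result with the closed-form fixed point. In the sketch below, the map is implemented as "cut the largest rate by the factor $b$, add the common increment $(1-b)\tilde{x}_1/N$ to every rate, and re-sort in decreasing order"; the function names `g` and `fixed_point` mirror the notation of the text, while the starting vector and parameter values are illustrative assumptions.

```python
def g(x, b):
    """One step of the map on rate vectors sorted in decreasing order.

    The largest rate x[0] is multiplied by b, then all rates grow by the
    common amount b1 * x[0], with b1 = (1 - b) / N, until the capacity is
    met again. The result is re-sorted in decreasing order.
    """
    n_users = len(x)
    inc = (1 - b) / n_users * x[0]
    new = [b * x[0] + inc] + [xi + inc for xi in x[1:]]
    return sorted(new, reverse=True)

def fixed_point(n_users, b, c):
    """Closed-form fixed point (21), listed from the largest to the smallest rate."""
    b1 = (1 - b) / n_users
    denom = n_users * b + (n_users + 1) * (1 - b) / 2
    return [(b + (n_users - n + 1) * b1) * c / denom
            for n in range(1, n_users + 1)]

def iterate(x, b, steps):
    """Apply the map `steps` times starting from x."""
    for _ in range(steps):
        x = g(x, b)
    return x
```

For example, starting from `[5.0, 3.0, 1.5, 0.5]` with `b = 0.5` (so that `c = 10`), a few hundred iterations of `g` should bring the vector close to `fixed_point(4, 0.5, 10.0)`, and the sum of the rates stays equal to the capacity at every step.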

We remark that the reduce maximal sending rate policy was investigated in [3]. There, only the existence and uniqueness of the fixed point (21) were established; the convergence and the absence of cycling behaviour were not shown.

Next, we shall scale the capacity constraint by a multiplicative constant . When we do such scaling, it is convenient to signify the dependence of