# Online Primal-Dual Algorithms with Configuration Linear Programs

Non-linear, especially convex, objective functions have been extensively studied in recent years, with approaches that rely crucially on the convexity of the cost functions. In this paper, we present primal-dual approaches based on configuration linear programs to design competitive online algorithms for problems with arbitrarily-growing objectives. This approach is particularly appropriate for non-linear (non-convex) objectives in the online setting. We first present a simple greedy algorithm for a general cost-minimization problem. The competitive ratio of the algorithm is characterized by means of a notion called smoothness, which is inspired by a similar concept in the context of algorithmic game theory. The algorithm gives optimal (up to a constant factor) competitive ratios while applying to different contexts such as network routing, vector scheduling, energy-efficient scheduling and non-convex facility location. Next, we consider online 0-1 covering problems with non-convex objectives. Building upon the resilient ideas from the primal-dual framework with configuration LPs, we derive a competitive algorithm for these problems. Our result generalizes the online primal-dual algorithm developed recently by Azar et al. for convex objectives with monotone gradients to non-convex objectives. The competitive ratio is now characterized by a new concept, called local smoothness, a notion inspired by smoothness. Our algorithm yields tight competitive ratios for objectives such as the sum of ℓ_k-norms and gives competitive solutions for online problems of submodular minimization and some natural non-convex minimization problems under covering constraints.


## 1 Introduction

In this paper, we consider problems of minimizing the total cost of resources used to satisfy online requests. One phenomenon, known as economy of scale, is that the cost grows sub-linearly with the amount of resources used. This happens in many applications in which one gets a discount when buying resources in bulk. A representative setting is the extensively-studied domain of sub-modular optimization. Another phenomenon, known as diseconomy of scale, is that the cost grows super-linearly with the quantity of resources used. An illustrative example of this phenomenon is the energy cost in computation, which grows super-linearly, typically as a convex function. The phenomenon of diseconomy of scale has been widely studied in the domain of convex optimization [14]. Non-convex objective functions appear in various problems, ranging from scheduling and sensor energy management to influence and revenue maximization and facility location. For example, when scheduling malleable jobs on parallel machines, the cost grows as a non-convex function [31] due to parallelization and synchronization. Besides, in practical facility location, the facility costs to serve clients are rarely constant or simply a convex function of the number of clients. Apart from some fixed opening amount, the cost would initially increase fast until some threshold on the number of clients, then become more stable before quickly increasing again as the number of clients grows. This behaviour of cost functions widely happens in economic contexts. Such situations raise the demand for algorithms with performance guarantees for non-convex objective functions. In this paper, we consider problems in which the cost grows arbitrarily with the amount of used resources.

### 1.1 A General Problem and Primal-Dual Approach

#### General Problem.

In the problem, there is a set E of resources and requests arrive online. At the arrival of request i, a set S_i of feasible strategies (actions) to satisfy the request is revealed. Each strategy s_ij ∈ S_i consists of a subset of resources in E. Each resource e is associated with a non-negative, non-decreasing, arbitrary cost function f_e, and the cost induced on resource e depends on the set of requests using e. The cost of a solution is the total cost of resources, i.e., ∑_e f_e(A_e) where A_e is the set of requests using resource e. The goal is to design an algorithm that, upon the arrival of each request, selects a feasible strategy for the request while keeping the cost of the overall solution as small as possible. We consider the standard competitive ratio as the performance measure of an algorithm: an algorithm is r-competitive if, for any instance, the ratio between the cost of the algorithm and that of an optimal solution is at most r.

#### Primal-Dual Approach.

We consider an approach based on linear programming for the problem. The first crucial step for any LP-based approach is to derive an LP formulation with reasonable integrality gap, which is defined as the ratio between the optimal integer solution of the formulation and the optimal solution without the integrality condition. As the cost functions are non-linear, it is not surprising that the natural relaxation suffers from a large integrality gap. This issue was observed and resolved by Makarychev and Sviridenko [37], who considered an offline variant of the problem in which the resource cost functions are convex. They systematically strengthen the natural formulations by introducing an exponential number of new variables and new constraints connecting the new variables to the original ones. Consequently, the new formulation, in the form of a configuration LP, significantly reduces the integrality gap. Although there is an exponential number of variables, Makarychev and Sviridenko showed that an approximately optimal fractional solution of the configuration LP can be computed in polynomial time. Then, by rounding the fractional solution, the authors derived a B_k-approximation algorithm for the resource cost minimization problem in which all cost functions are polynomials of degree at most k. Here B_k denotes the Bell number, which grows asymptotically as (Θ(k/ln k))^k.

The rounding scheme in [37] is intrinsically offline and is not suitable in the online setting. Moreover, another issue in our problem is that the cost functions are not necessarily convex. That represents a substantial obstacle, since all currently known techniques for non-linear objectives rely crucially on the convexity of cost functions and Fenchel duality [22, 7, 38, 29, 30, 23, 8].

To overcome these difficulties, we consider a primal-dual approach with configuration LPs. First, primal-dual is particularly appropriate since one does not have to compute an optimal fractional solution, which would require full information on the instance. Second, in our approach, the dual variables of the configuration LP have intuitive meanings and the dual constraints indeed guide the decisions of the algorithm. The key step in the approach is to show that the constructed dual variables constitute a dual feasible solution. In order to prove dual feasibility, we define a notion of smoothness of functions. This definition is inspired by the smoothness framework introduced by Roughgarden [42] in the context of algorithmic game theory to characterize the price of anarchy for large classes of games. The smoothness notion allows us not only to prove dual feasibility but also to establish the competitiveness of algorithms in our approach. We characterize the performance of algorithms using the notion of smoothness in a similar way as the price of anarchy is characterized by the smoothness argument [42]. Through this notion, we show an interesting connection between online algorithms and algorithmic game theory.

###### Definition 1

Let N be a set of requests. A set function f: 2^N → ℝ⁺ is (λ, μ)-smooth if for any set A = {a_1, …, a_n} ⊆ N and any collection of nested sets B_1 ⊆ B_2 ⊆ … ⊆ B_n ⊆ N, the following inequality holds:

 ∑_{i=1}^n [f(B_i ∪ a_i) − f(B_i)] ≤ λ·f(A) + μ·f(B_n)

A set of cost functions {f_e : e ∈ E} is (λ, μ)-smooth if every function f_e is (λ, μ)-smooth.

Intuitively, given a (λ, μ)-smooth function, the parameters (λ, μ) measure how far the function is from being linear. If a function is linear then it is (1, 0)-smooth.
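To make the definition concrete, the following small sketch (not from the paper; the helper names and the restriction to constant chains B_1 = … = B_n = B, which gives only a necessary condition, are our own simplifying assumptions) brute-forces the smoothness inequality on a tiny ground set:

```python
from itertools import combinations

def subsets(ground):
    elems = sorted(ground)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def passes_smoothness_check(f, ground, lam, mu):
    # Necessary condition from Definition 1, restricted to constant chains
    # B_1 = ... = B_n = B: sum_{a in A} [f(B u {a}) - f(B)] <= lam*f(A) + mu*f(B).
    for A in subsets(ground):
        for B in subsets(ground):
            lhs = sum(f(B | {a}) - f(B) for a in A)
            if lhs > lam * f(A) + mu * f(B) + 1e-9:
                return False
    return True

ground = {1, 2, 3}
linear = lambda S: float(len(S))          # f(S) = |S|: a linear set function
quadratic = lambda S: float(len(S)) ** 2  # f(S) = |S|^2: super-linear

print(passes_smoothness_check(linear, ground, 1, 0))     # True
print(passes_smoothness_check(quadratic, ground, 1, 0))  # False
```

As expected, the linear function passes with (λ, μ) = (1, 0), while |S|² already fails this weaker check (take A a singleton outside a large B).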

###### Theorem 1

Assume that all resource cost functions are (λ, μ)-smooth for some parameters λ > 0 and 0 ≤ μ < 1. Then there exists a greedy λ/(1−μ)-competitive algorithm for the general problem.

Note that, restricted to the class of polynomials with non-negative coefficients of degree at most k, our algorithm yields a competitive ratio (a consequence of Lemma 6) comparable to the best-known approximation ratio B_k [37]. However, our greedy algorithm is light-weight and much simpler than that of [37], which involves solving an LP of exponential size and rounding fractional solutions. Hence, our algorithm can also be used to design approximation algorithms if one looks for a tradeoff between simplicity and the performance guarantee.

#### Applications.

We show the applicability of the theorem by deriving competitive algorithms for several problems in online setting, such as Minimum Power Survival Network Routing, Vector Scheduling, Energy-Efficient Scheduling, Prize Collecting Energy-Efficient Scheduling, Non-Convex Facility Location. Among such applications, the most representative ones are the Energy-Efficient Scheduling problem and the Non-Convex Facility Location problem.

In Online Energy-Efficient Scheduling, one has to process jobs on unrelated machines without migration with the objective of minimizing the total energy. No result was known for this problem in multiple-machine environments. Among others, a difficulty is the construction of a formulation with bounded integrality gap. We notice that for this problem, Gupta et al. [29] gave a primal-dual competitive algorithm for a single machine. However, their approach cannot be used for unrelated machines due to the large integrality gap of their formulation. For these problems, we present competitive algorithms for arbitrary cost functions beyond the convexity property. Note that the convexity of cost functions is a crucial property employed in the analyses of previous work. If the cost functions have the typical polynomial form, the resulting competitive ratio is optimal up to a constant factor for all the problems above. Besides, in the offline setting, this ratio is close to the currently best-known approximation ratio B_k [37].

In Online Non-Convex Facility Location, clients arrive online and have to be assigned to facilities. The cost of a facility consists of a fixed opening cost and a serving cost, which is an arbitrary monotone function of the number of clients assigned to the facility. The objective is to minimize the total client-facility connection cost plus the facility cost. This problem is related to the capacitated network design and energy-efficient routing problems [5, 36]. In the latter, given a graph and a set of connectivity demands, the cost of each edge (node) is uniform and consists of a fixed cost plus a load-dependent cost of the total flow passing through the edge (node). (Here uniformity means the cost functions are the same for every edge.) The objective is to minimize the total cost while satisfying all connectivity demands. Antoniadis et al. [5] and Krishnaswamy et al. [36] have provided online/offline algorithms with poly-logarithmic guarantees. It is an intriguing open question (originally raised in [3]) to design a poly-logarithmic competitive algorithm for non-uniform cost functions. Online Non-Convex Facility Location can be seen as a step towards this goal. In fact, the former can be considered as the connectivity problem on a simple depth-2 graph where the cost functions are now non-uniform.

This problem is beyond the scope of the general problem, but we show that the resilient ideas from the primal-dual framework can be used to derive a competitive algorithm. Specifically, we present a competitive algorithm whose ratio is expressed in terms of the (λ, μ)-smoothness parameters of the serving-cost functions. The algorithm is inspired by the Fotakis primal-dual algorithm in the classic setting [25] and by our primal-dual approach based on configuration LPs. In particular, for the problem with non-uniform cost functions, where the serving cost of each facility depends on facility-specific parameters and on the number of clients assigned to it, the algorithm yields an explicit competitive ratio via these smoothness parameters.

### 1.2 Primal-Dual Approach for 0−1 Covering Problems

#### 0−1 Covering Problems.

We consider an extension of the general problem described in the previous section in which the resources are subject to covering constraints. Formally, let E be a set of n resources and let f: {0,1}^n → ℝ⁺ be an arbitrary monotone cost function. Let x_e be a variable indicating whether resource e is selected. The covering constraints ∑_e b_{i,e}·x_e ≥ c_i for every i are revealed one-by-one, and at any step one needs to maintain a feasible integer solution x. The goal is to design an algorithm that minimizes f(x) subject to the online covering constraints and x_e ∈ {0,1} for every e.

Very recently, Azar et al. [8] presented a general primal-dual framework for the case where the function f is convex with monotone gradient. The framework, inspired by the Buchbinder-Naor framework [16] for linear objectives, crucially relies on Fenchel duality and the convexity of the objective functions. We overcome this obstacle for non-convex functions, and also for convex functions with non-monotone gradients, by considering the configuration LP corresponding to the problem and the multilinear extension of the function f. Given f: {0,1}^n → ℝ⁺, its multilinear extension F: [0,1]^n → ℝ⁺ is defined as F(x) = ∑_{S⊆E} ∏_{e∈S} x_e · ∏_{e∉S} (1 − x_e) · f(1_S), where 1_S is the characteristic vector of S (i.e., the e-th component of 1_S equals 1 if e ∈ S and equals 0 otherwise). Building upon the primal-dual framework in [8, 16] and the resilient ideas from the primal-dual approach for the general problem described earlier, we present a competitive algorithm, which follows closely the one in [8], for the fractional covering problem. Specifically, we introduce the notion of local smoothness and characterize the competitive ratio using the local-smoothness parameters.

###### Definition 2

Let E be a set of n resources. A differentiable function F: [0,1]^n → ℝ⁺ is (λ, μ)-locally-smooth if for any set S ⊆ E and any x ∈ [0,1]^n, the following inequality holds:

 ∑_{e∈S} ∇_e F(x) ≤ λ·F(1_S) + μ·F(x)        (1)
###### Theorem 2

Let F be the multilinear extension of the objective cost f and let d be the maximal row sparsity of the constraint matrix, i.e., d = max_{i,A} |{e ∉ A : b_{i,e,A} > 0}|. Assume that F is (λ, μ)-locally-smooth for some parameters λ > 0 and 0 ≤ μ < 1. Then there exists an O((λ/(1−μ))·log d)-competitive algorithm for the fractional covering problem.

Our algorithm, as well as the algorithm in [8] for convex objectives with monotone gradients and the recent algorithm for ℓ_k-norms [40], are natural extensions of the Buchbinder-Naor primal-dual framework [16]. A distinguishing point of our algorithm compared to the ones in [8, 40] is that in the latter, the gradient at the current primal solution is used to define a multiplicative update for the primal, whereas we use the gradient of the multilinear extension to define such an update. This (rather small) modification, coupled with configuration LPs, enables us to derive a competitive algorithm for convex objective functions whose gradients are not necessarily monotone and, more generally, for non-convex objectives. Moreover, the use of configuration LPs and the notion of local smoothness is twofold: (i) it avoids cumbersome technical details in the analysis as well as in the assumptions on the objective functions; (ii) it reduces the proof of bounding the competitive ratio for a class of objective functions to determining the local-smoothness parameters.

Specifically, we apply our algorithm to several widely-studied classes of functions in optimization. First, for the class of non-negative polynomials of degree k, the algorithm yields a competitive fractional solution that matches a result in [8]. Second, for the class of sums of ℓ_k-norms, Nagarajan and Shen [40], based on the algorithm in [8], have recently presented a nearly tight O(log d + log ρ)-competitive algorithm, where ρ is the ratio between the largest and the smallest positive entries of the covering matrix. We show that our algorithm yields a tight O(log d)-competitive ratio for this class of functions. (The matching lower bound is given in [15].) Third, beyond convexity, we consider a natural class of non-convex cost functions which represent a typical behaviour of resources in serving demand requests. Non-convexity represents a strong barrier in optimization in general and in the design of algorithms in particular. We show that our algorithm is competitive for this class of functions. Finally, we illustrate the applicability of our algorithm to the class of submodular functions. We make a connection between the local-smoothness parameters and the concept of total curvature of submodular functions. The total curvature has been widely used to determine both upper and lower bounds on the approximation ratios for many submodular and learning problems [21, 27, 9, 44, 34, 43]. We show that our algorithm yields a competitive fractional solution, with a ratio depending on the total curvature, for the problem of minimizing a submodular function under covering constraints. To the best of our knowledge, submodular minimization under general covering constraints has not been studied in the online computation setting.

### 1.3 Related work

In this section we summarize the work related to our approach. Each problem in the applications of the main theorems, together with its related work, is formally presented in the corresponding section.

In this paper, we systematically strengthen natural LPs by the construction of the configuration LPs presented in [37]. Makarychev and Sviridenko [37] propose a scheme that consists of solving the new LPs (with an exponential number of variables) and rounding the fractional solutions to integer ones using decoupling inequalities. By this method, they derive approximation algorithms for several (offline) optimization problems which can be formulated by linear constraints and an objective function which is a power of some constant degree k. Specifically, the approximation ratio is proved to be the Bell number B_k for several problems.

In our approach, a crucial element for characterizing the performance of an algorithm is the smoothness property of the cost functions. The smoothness argument was introduced by Roughgarden [42] in the context of algorithmic game theory, and it has successfully characterized the performance of equilibria (price of anarchy) in many classes of games, such as congestion games [42]. This notion inspires the definition of smoothness in our paper.

Primal-dual methods have been shown to be powerful tools in online computation. Buchbinder and Naor [16] presented a primal-dual method for linear programs with packing/covering constraints. Their method unifies several previous potential-function-based analyses and gives a principled approach to design and analyze algorithms for problems with linear relaxations. Convex objective functions have been extensively studied in online settings in recent years, in areas such as energy-efficient scheduling [2, 41, 22, 32, 7], paging [38], network routing [29], combinatorial auctions [13, 30], and matching [23]. Recently, Azar et al. [8] gave a unified framework for covering/packing problems with convex objectives whose gradients are monotone. Consequently, improved algorithms have been derived for several problems. The crucial point in the design and analysis of the above approaches relies on the convexity of cost functions. Specifically, the construction of the dual programs is based on convex conjugates and Fenchel duality for primal convex programs. Very recently, Nagarajan and Shen [40] considered objective functions given as sums of ℓ_k-norms. This class of functions does not fall into the framework developed in [8] since the gradients are not necessarily monotone. By a different analysis, Nagarajan and Shen [40] proved that the algorithm presented in [8] yields a nearly tight O(log d + log ρ)-competitive ratio, where ρ is the ratio between the largest and the smallest positive entries of the covering matrix. In these approaches, it is not clear how to design competitive algorithms for non-convex functions. A distinguishing point of our approach is that it gives a framework to study non-convex cost functions.

#### Organization.

In Section 2, we present the framework for the general problem described in Section 1.1; the applications of this framework are given in Appendix A. In Section 3, we give the framework for the 0-1 covering problems, where some proofs can be found in Appendix B. Technical lemmas, which will be used to determine the smoothness and local-smoothness parameters, are given in Appendix C.

## 2 Primal-Dual General Framework

In this section, we consider the general problem described in Section 1.1 and present a primal-dual framework for this problem.

#### Formulation.

We consider a formulation for the resource cost minimization problem following the configuration LP construction in [37]. We say that A is a configuration associated to resource e if A is a subset of the requests using e. Let x_ij be a variable indicating whether request i selects strategy (action) s_ij. For configuration A and resource e, let z_{e,A} be a variable such that z_{e,A} = 1 if and only if, for every request i ∈ A, x_ij = 1 for some strategy s_ij such that e ∈ s_ij. In other words, z_{e,A} = 1 iff the set of requests using e is exactly A. We consider the following formulation and the dual of its relaxation.

 min ∑_{e,A} f_e(A)·z_{e,A}
 s.t.  ∑_{j: s_ij ∈ S_i} x_ij = 1                   ∀i
       ∑_{A: i∈A} z_{e,A} = ∑_{j: e∈s_ij} x_ij      ∀i, e
       ∑_A z_{e,A} = 1                              ∀e
       x_ij, z_{e,A} ∈ {0,1}                        ∀i, j, e, A
 max ∑_i α_i + ∑_e γ_e
 s.t.  α_i ≤ ∑_{e ∈ s_ij} β_{i,e}         ∀i, j
       γ_e + ∑_{i∈A} β_{i,e} ≤ f_e(A)     ∀e, A

In the primal, the first constraint guarantees that request i selects some strategy s_ij. The second constraint ensures that if request i selects a strategy s_ij containing resource e, then in the solution, the set of requests using e must contain i. The third constraint says that in the solution, there is always a configuration associated to resource e.

#### Algorithm.

We first interpret the dual variables and dual constraints intuitively and derive useful observations for a competitive algorithm. Variable α_i represents the increase of the total cost due to the arrival of request i. Variable β_{i,e} stands for the marginal cost on resource e if request i uses e. By this interpretation, the first dual constraint clearly indicates the behaviour of an algorithm: when a new request is released, select a strategy that minimizes the marginal increase of the total cost. Therefore, we deduce the following greedy algorithm.

Let A*_e be the set of current requests using resource e. Initially, A*_e = ∅ for every e. At the arrival of request i, select a strategy s_ij ∈ S_i that is an optimal solution of

 min ∑_{e∈s_ij} [f_e(A*_e ∪ i) − f_e(A*_e)]   s.t.  s_ij ∈ S_i.        (2)

Although computational complexity is not a main issue for online problems, we notice that in many applications, the optimal solution of this mathematical program can be computed efficiently (for example when the f_e's are convex and the strategy sets can be represented succinctly in the form of a polynomial-size polytope).
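As an illustration, the greedy rule (2) can be sketched as follows. This is a toy simplification we made for exposition: each f_e is assumed to depend only on the number of requests using e, and names such as `greedy_select` are ours, not the paper's.

```python
def greedy_select(strategies, cost_fns, usage):
    """Pick the strategy minimizing the marginal increase of the total cost,
    then commit the request to it. `usage[e]` counts requests on resource e."""
    def marginal(strategy):
        return sum(cost_fns[e](usage[e] + 1) - cost_fns[e](usage[e])
                   for e in strategy)
    best = min(strategies, key=marginal)
    for e in best:
        usage[e] += 1
    return best

cube = lambda n: n ** 3  # an energy-like cost f_e(n) = n^3
cost_fns = {"e1": cube, "e2": cube}
usage = {"e1": 0, "e2": 0}

greedy_select([frozenset({"e1"}), frozenset({"e2"})], cost_fns, usage)
# The second request sees marginal costs 2^3 - 1 = 7 on e1 versus 1 on e2,
# so the greedy rule balances the load onto the still-empty resource.
second = greedy_select([frozenset({"e1"}), frozenset({"e2"})], cost_fns, usage)
print(second)  # frozenset({'e2'})
```

With a super-linear cost such as n³, the rule naturally spreads requests over resources, which is exactly the behaviour the smoothness analysis charges against.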

#### Dual variables.

Assume that all resource cost functions are (λ, μ)-smooth for some fixed parameters λ > 0 and 0 ≤ μ < 1. We now construct a dual feasible solution. Define α_i as 1/λ times the optimal value of the mathematical program (2). Informally, α_i is proportional to the increase of the total cost due to the arrival of request i. For each resource e and request i, define

 β_{i,e} := (1/λ)·[f_e(A*_{e,≺i} ∪ i) − f_e(A*_{e,≺i})]

where A*_{e,≺i} is the set of requests using resource e (due to the algorithm) prior to the arrival of i. In other words, β_{i,e} equals 1/λ times the marginal cost of resource e if i uses e. Finally, for every resource e define the dual variable γ_e := −(μ/λ)·f_e(A*_e), where A*_e is the set of all requests using e (at the end of the instance).

###### Lemma 1

The dual variables defined as above are feasible.

Proof  The first dual constraint follows immediately from the definitions of α_i, β_{i,e} and the decision of the algorithm. Specifically, the right-hand side of the constraint represents 1/λ times the increase of cost if request i selects strategy s_ij. This is at least 1/λ times the minimum increase of cost optimized over all strategies in S_i, which is α_i.

We now show that the second constraint holds. Fix a resource e and a configuration A. The corresponding constraint reads

 −(μ/λ)·f_e(A*_e) + (1/λ)·∑_{i∈A} [f_e(A*_{e,≺i} ∪ i) − f_e(A*_{e,≺i})] ≤ f_e(A)
 ⇔ ∑_{i∈A} [f_e(A*_{e,≺i} ∪ i) − f_e(A*_{e,≺i})] ≤ λ·f_e(A) + μ·f_e(A*_e)

This inequality holds by the (λ, μ)-smoothness of f_e, applied to the set A and the nested collection of sets A*_{e,≺i} ⊆ A*_e for i ∈ A. Hence, the second dual constraint follows.

###### Theorem 1

Assume that all cost functions are (λ, μ)-smooth. Then, the algorithm is λ/(1−μ)-competitive.

Proof  By the definitions of dual variables, the dual objective is

 ∑_i α_i + ∑_e γ_e = (1/λ)·∑_e f_e(A*_e) − (μ/λ)·∑_e f_e(A*_e) = ((1−μ)/λ)·∑_e f_e(A*_e)

Besides, the cost of the solution due to the algorithm is ∑_e f_e(A*_e). Hence, by weak duality, the competitive ratio is at most λ/(1−μ).

#### Applications.

Theorem 1 yields simple algorithms with optimal competitive ratios for several problems mentioned in the introduction. Among others, we give optimal algorithms for energy-efficient scheduling problems (in the unrelated machine environment) and the facility location problem with client-dependent costs. Prior to our work, no competitive algorithm was known for these problems. These applications can be found in Appendix A. The proofs are now reduced to computing smoothness parameters, which subsequently imply the competitive ratios. We mainly use the smoothness inequalities in Lemma 6, developed in [20], to derive explicit competitive bounds in the case of non-negative polynomial cost functions.

## 3 Primal-Dual Framework for 0−1 Covering Problems

Consider the following integer optimization problem. Let E be a set of n resources and let f: {0,1}^n → ℝ⁺ be a monotone cost function. Let x_e be a variable indicating whether resource e is selected. The problem is to minimize f(x) subject to covering constraints ∑_e b_{i,e}·x_e ≥ c_i for every constraint i and x_e ∈ {0,1} for every e. In the online setting, the constraints are revealed one-by-one, and at any step one needs to maintain a feasible integer solution x.

We consider the multilinear extension of the function f, defined as follows. Given f, define its multilinear extension F by F(x) = ∑_{S⊆E} ∏_{e∈S} x_e · ∏_{e∉S} (1 − x_e) · f(1_S), where 1_S is the characteristic vector of S (i.e., the e-th component of 1_S equals 1 if e ∈ S and equals 0 otherwise). Note that F(1_S) = f(1_S) for every S ⊆ E. An alternative way to define F is to set F(x) = E[f(1_R)], where R is a random set in which each resource e appears independently with probability x_e.
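As a sanity check, the multilinear extension can be evaluated exactly by enumeration on a tiny ground set. This is an illustrative sketch of the definition above; the helper name is ours.

```python
from itertools import combinations

def multilinear_extension(f, ground, x):
    """F(x) = sum over S of prod_{e in S} x_e * prod_{e not in S} (1 - x_e) * f(1_S)."""
    elems = sorted(ground)
    total = 0.0
    for r in range(len(elems) + 1):
        for S in combinations(elems, r):
            p = 1.0
            for e in elems:
                p *= x[e] if e in S else 1.0 - x[e]
            total += p * f(frozenset(S))
    return total

f = lambda S: float(len(S)) ** 2  # example cost f(1_S) = |S|^2
ground = {0, 1, 2}
# At an integral point, F agrees with f: here F(1_{{0,1}}) = f(1_{{0,1}}) = 4.
print(multilinear_extension(f, ground, {0: 1.0, 1: 1.0, 2: 0.0}))  # 4.0
# At x = (1/2, 1/2, 1/2), F(x) = E[|R|^2] with |R| ~ Binomial(3, 1/2), i.e. 3.0.
print(multilinear_extension(f, ground, {0: 0.5, 1: 0.5, 2: 0.5}))  # 3.0
```

The second evaluation illustrates the probabilistic view F(x) = E[f(1_R)]: for |R| ~ Binomial(3, 1/2), E[|R|²] = Var + mean² = 0.75 + 2.25 = 3.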

In this section we present an online algorithm that outputs a competitive fractional solution for the function F subject to the same set of constraints. Rounding schemes depend on the specific problem; for example, one can benefit from techniques in [8], where rounding schemes have been given for several problems.

### 3.1 Algorithm for Fractional Covering

We recall that a differentiable function F: [0,1]^n → ℝ⁺ is (λ, μ)-locally-smooth if for any subset S ⊆ E of resources and any x ∈ [0,1]^n, the following inequality holds:

 ∑_{e∈S} ∇_e F(x) ≤ λ·F(1_S) + μ·F(x)

#### Formulation.

We say that S ⊆ E is a configuration if 1_S corresponds to a feasible solution. Let x_e be a variable indicating whether resource e is used. For configuration S, let z_S be a variable such that z_S = 1 if and only if x_e = 1 for every resource e ∈ S and x_e = 0 for e ∉ S. In other words, z_S = 1 iff 1_S is the selected solution of the problem. For any subset A ⊂ E, define the knapsack-cover coefficients c_{i,A} := c_i − ∑_{e∈A} b_{i,e} and b_{i,e,A} := min{b_{i,e}, c_{i,A}}. We consider the following formulation and the dual of its relaxation.

 min ∑_S f(1_S)·z_S
 s.t.  ∑_{e∉A} b_{i,e,A}·x_e ≥ c_{i,A}    ∀i, A ⊂ E
       ∑_{S: e∈S} z_S = x_e               ∀e
       ∑_S z_S = 1
       x_e, z_S ∈ {0,1}                   ∀e, S
 max ∑_{i,A} c_{i,A}·α_{i,A} + γ
 s.t.  ∑_i ∑_{A: e∉A} b_{i,e,A}·α_{i,A} ≤ β_e    ∀e
       γ + ∑_{e∈S} β_e ≤ f(1_S)                  ∀S
       α_{i,A} ≥ 0                               ∀i, A

In the primal, the first constraints are the knapsack-cover constraints corresponding to the given polytope. The second constraint ensures that if a resource e is chosen then the selected configuration must contain e. The third constraint says that some configuration must be selected.

#### Algorithm.

Assume that the function F is (λ, μ)-locally smooth. Let d be the maximal number of positive entries in a row, i.e., d := max_{i,A} |{e ∉ A : b_{i,e,A} > 0}|. Consider the following Algorithm 1, which follows the scheme in [8] with some more subtle steps.
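The pseudocode of Algorithm 1 is not reproduced here; the following is only a hedged sketch of the continuous multiplicative-update scheme it follows (in the style of Buchbinder-Naor), discretized with a small step and written for a single normalized constraint with row sparsity d. All names are ours, and we assume the constraint is satisfiable at x = 1.

```python
def cover_constraint(b, x, d, step=1e-3):
    """Raise the fractional solution until sum_e b[e] * x[e] >= 1 holds.
    Each unsaturated variable grows at a rate proportional to b[e]*x[e] + 1/d,
    mirroring the additive 1/d term that appears in the proof of Theorem 2.
    Assumes the constraint is satisfiable with all x_e = 1."""
    while sum(b[e] * x[e] for e in b) < 1.0:
        for e in b:
            if b[e] > 0 and x[e] < 1.0:
                x[e] = min(1.0, x[e] + step * (b[e] * x[e] + 1.0 / d))
    return x

x = cover_constraint({"e1": 1.0, "e2": 1.0}, {"e1": 0.0, "e2": 0.0}, d=2)
print(sum(x.values()) >= 1.0)  # True
```

The solution only ever increases, so feasibility of earlier constraints is preserved, which is the key monotonicity property exploited by the online analysis.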

#### Dual variables.

Variables α_{i,A} are constructed in the algorithm. Let x be the current solution of the algorithm. Define β_e := ∇_e F(x)/(8λ·ln(1+2d²)) and γ := −(μ/(8λ·ln(1+2d²)))·F(x).

The following lemma gives a lower bound on the x-variables; the proof is given in Appendix B.

###### Lemma 2

Let e be an arbitrary resource. During the execution of the algorithm, whenever x_e has been raised to 1 it holds that

 x_e ≥ (1/(max_i b_{i,e,A}·d))·[exp(ln(1+2d²)) − 1]

###### Lemma 3

The dual variables defined as above are feasible.

Proof  As long as a primal covering constraint is unsatisfied, the x-variables are increased. Therefore, at the end of an iteration, the primal constraint is satisfied. Consider the first dual constraint. The algorithm always maintains ∑_i ∑_{A: e∉A} b_{i,e,A}·α_{i,A} ≤ β_e (strict inequality happens only if x_e = 1). Whenever this inequality is about to be violated, some α-variables are decreased in such a way that the rate of increase of the left-hand side is at most 0. Hence, by the definition of the β-variables, the first dual constraint holds.

By the definitions of the dual variables and rearranging terms, the second dual constraint reads

 (1/(8λ·ln(1+2d²)))·∑_{e∈S} ∇_e F(x) ≤ F(1_S) + (μ/(8λ·ln(1+2d²)))·F(x)

This inequality follows from the (λ, μ)-local smoothness of F, which gives ∑_{e∈S} ∇_e F(x) ≤ λ·F(1_S) + μ·F(x), together with 8·ln(1+2d²) ≥ 1.

We are now ready to prove the main theorem.

###### Theorem 2

Assume that the cost function F is (λ, μ)-locally-smooth. Then, the algorithm is 8λ·ln(1+2d²)/(1−μ)-competitive.

Proof  We will bound the increase of the primal cost and of the dual objective at any time τ during the execution of the algorithm. Consider a moment at which the current constraint k, with associated configuration A*, is unsatisfied. The derivative of the primal objective with respect to τ is:

 ∑_e ∇_e F(x)·(∂x_e/∂τ) = ∑_{e: b_{k,e,A*}>0, x_e<1} ∇_e F(x)·(b_{k,e,A*}·x_e + 1/d)/∇_e F(x)
                        ≤ ∑_{e: b_{k,e,A*}>0} (b_{k,e,A*}·x_e + 1/d) ≤ 2

For a time τ, let U(τ) be the set of resources e such that x_e = 1 and b_{k,e,A*} > 0. Note that |U(τ)| ≤ d by the definition of d. As long as the constraint is unsatisfied (so x_e < 1/b_{k,e,A*} for every e), for every e ∈ U(τ), by Lemma 2, we have

 1/b_{k,e,A*} > x_e ≥ (1/(max_i b_{i,e,A*}·d))·[exp(ln(1+2d²)) − 1] = 2d/max_i b_{i,e,A*}

Therefore, b_{k,e,A*}/b_{m*_e,e,A*} ≤ 1/(2d), where m*_e := argmax_i b_{i,e,A*}.

We now bound the increase of the dual at time τ. The derivative of the dual with respect to τ is:

 ∂D/∂τ = ∑_i ∑_A c_{i,A}·(∂α_{i,A}/∂τ) + ∂γ/∂τ = ∑_i c_{i,A*}·(∂α_{i,A*}/∂τ) + ∂γ/∂τ
       = (1/(λ·ln(1+2d²)))·(1 − ∑_{e∈U(τ)} b_{k,e,A*}/b_{m*_e,e,A*})
         − (μ/(8λ·ln(1+2d²)))·∑_e ∇_e F(x)·(∂x_e/∂τ)
       ≥ (1/(λ·ln(1+2d²)))·(1 − ∑_{e∈U(τ)} 1/(2d))
         − (μ/(8λ·ln(1+2d²)))·∑_{e∉A*: b_{k,e,A*}>0, x_e<1} (b_{k,e,A*}·x_e + 1/d)
       ≥ (1−μ)/(4λ·ln(1+2d²))

The third equality holds since α_{k,A*} is increased while the other α-variables corresponding to resources in U(τ) are decreased. The first inequality follows from the fact that b_{k,e,A*}/b_{m*_e,e,A*} ≤ 1/(2d) for every e ∈ U(τ). The last inequality holds since

 ∑e∉A∗:bi,e,A∗>0,xe<1(bi,e,A∗xe+1d)≤2

Hence, the competitive ratio is at most 2/[(1−μ)/(4λ·ln(1+2d²))] = 8λ·ln(1+2d²)/(1−μ) = O((λ/(1−μ))·log d).

### 3.2 Applications

In this section, we consider the applications of Theorem 2 to classes of cost functions which have been extensively studied in optimization, such as polynomials with non-negative coefficients, ℓ_k-norms and submodular functions. We are interested in deriving fractional solutions with performance guarantees. (Rounding schemes to obtain integral solutions for concrete problems are problem-specific and are not considered in this section. Several rounding techniques have been shown for different problems, for example in [8] for polynomials with non-negative coefficients, or using online contention resolution schemes for submodular functions [24].) We show that Algorithm 1 yields competitive fractional solutions for the classes of functions mentioned above and also for some natural classes of non-convex functions.

We first take a closer look at the definition of local smoothness. Let F be the multilinear extension of a set function f. By the definition of the multilinear extension, F(x) = E[f(1_R)] where R is a random set in which each resource e appears independently with probability x_e. Moreover, since F is linear in each coordinate x_e, we have

$$\frac{\partial F}{\partial x_e}(x) = F(x_1,\ldots,x_{e-1},1,x_{e+1},\ldots,x_n) - F(x_1,\ldots,x_{e-1},0,x_{e+1},\ldots,x_n) = \mathbb{E}\bigl[f(1_{R\cup\{e\}})-f(1_R)\bigr]$$

where $R$ is a random subset of resources such that each resource $e'$ is included independently with probability $x_{e'}$. Therefore, in order to prove that $F$ is $(\lambda,\mu)$-locally-smooth, it is equivalent to show that

$$\sum_{e\in S}\mathbb{E}\bigl[f(1_{R\cup\{e\}})-f(1_R)\bigr] \;\le\; \lambda f(1_S)+\mu\,\mathbb{E}\bigl[f(1_R)\bigr] \tag{3}$$
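To make the identity above concrete, here is a small Python sketch (illustrative only; the coverage function and probabilities are hypothetical, not from the paper) that evaluates the multilinear extension exactly by enumerating subsets and computes the coordinate derivative as the difference of the two restrictions:

```python
import itertools

def multilinear_extension(f, x):
    """Exact multilinear extension F(x) = E[f(1_R)], where each resource e
    is included in the random set R independently with probability x[e]."""
    n = len(x)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n):
        p = 1.0
        for e, b in enumerate(bits):
            p *= x[e] if b else 1.0 - x[e]
        total += p * f(frozenset(e for e, b in enumerate(bits) if b))
    return total

def partial_derivative(f, x, e):
    """dF/dx_e = F(..., x_e = 1, ...) - F(..., x_e = 0, ...),
    which equals E[f(1_{R u {e}}) - f(1_R)] since F is linear in x_e."""
    hi, lo = list(x), list(x)
    hi[e], lo[e] = 1.0, 0.0
    return multilinear_extension(f, hi) - multilinear_extension(f, lo)

# Hypothetical submodular coverage function on resources {0, 1, 2}:
# f(S) = number of items covered by the resources in S.
cover = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}}
f = lambda S: len(set().union(*(cover[e] for e in S))) if S else 0

x = [0.3, 0.5, 0.8]
# Adding resource 2 gains item "c" exactly when resource 1 is absent,
# so dF/dx_2 = 1 - x[1] = 0.5 (up to floating-point error).
print(partial_derivative(f, x, 2))
```

The exact enumeration is exponential in the number of resources; in practice the expectation is estimated by sampling, but the small exact version suffices to sanity-check local-smoothness inequalities such as (3).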

#### Polynomials with non-negative coefficients.

Let  be a polynomial with non-negative coefficients and the cost function defined as  where  for every . The following proposition shows that our algorithm yields the same competitive ratio as the one derived in [8] for this class of cost functions. This bound is indeed tight up to a constant factor [8].

###### Proposition 1 ([8])

For any convex polynomial function of degree , there exists an -competitive algorithm for the fractional covering problem.

Proof  We prove that Algorithm 1 is -competitive for this class of cost functions. By Theorem 2, it is sufficient to verify that is -locally smooth. We indeed prove a stronger inequality than (3), that is for any set ,

$$\sum_{e\in S}\bigl[f(1_{R\cup\{e\}})-f(1_R)\bigr] \le O\Bigl(\bigl(\tfrac{k}{\ln k}\bigr)^{k-1}\Bigr)\cdot f(1_S) + \frac{k-1}{k\ln k}\cdot f(1_R)$$

or equivalently, for any set ,

$$\sum_{e\in S}\Bigl[g\Bigl(a_e+\sum_{e'\in R}a_{e'}\Bigr)-g\Bigl(\sum_{e'\in R}a_{e'}\Bigr)\Bigr] \le O\Bigl(\bigl(\tfrac{k}{\ln k}\bigr)^{k-1}\Bigr)\cdot g\Bigl(\sum_{e\in S}a_e\Bigr) + \frac{k-1}{k\ln k}\cdot g\Bigl(\sum_{e'\in R}a_{e'}\Bigr)$$

This inequality holds by Lemma 6. Hence, the proposition follows.

#### Norms functions.

Let  be a function given by a weighted sum of -norms, i.e.,  where  is a subset of resources and  for . The cost function is defined as . This class of functions does not fall into the framework of Azar et al. [8] since the corresponding gradient is not monotone. Very recently, Nagarajan and Shen [40] have overcome this difficulty and presented a nearly tight -competitive algorithm, where the ’s are entries of the covering matrix. The lower bound is [15], which holds even for the -norm (linear costs). Using Theorem 2, we show that our algorithm yields a tight competitive ratio for this class of functions.

###### Proposition 2

The algorithm is optimal (up to a constant factor) for the class of weighted sum of -norms with competitive ratio .

Proof  It is sufficient to verify that  is -locally smooth. Again, we prove a stronger inequality than (3): for any index , for any set ,

$$\sum_{e\in S} w_j\bigl[f(1_{R\cup\{e\}})-f(1_R)\bigr] \le w_j f(1_S) \;\Longleftrightarrow\; \sum_{e\in S} w_j\bigl[\|1_{R\cup\{e\}}\|_{k_j}-\|1_R\|_{k_j}\bigr] \le w_j\|1_S\|_{k_j}$$

Note that the norm function is convex (with respect to its argument). Therefore,

$$\sum_{e\in S}\bigl[\|1_{R\cup\{e\}}\|_{k_j}-\|1_R\|_{k_j}\bigr] \le \|1_{R\cup S}\|_{k_j}-\|1_R\|_{k_j} \le \|1_S\|_{k_j}$$

where the last inequality holds due to the triangle inequality of norms. The proposition follows.
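As a sanity check on the final step of this proof, the following Python snippet (illustrative; the weights are hypothetical values chosen only for the experiment) verifies the triangle-inequality bound $\|1_{R\cup S}\|_k-\|1_R\|_k\le\|1_S\|_k$ over all pairs of subsets of a small ground set:

```python
import itertools

def weighted_lk_norm(S, w, k):
    """Weighted l_k-norm of the indicator vector 1_S: (sum_{e in S} w_e^k)^(1/k)."""
    return sum(w[e] ** k for e in S) ** (1.0 / k) if S else 0.0

w = [1.0, 2.5, 0.7, 3.2, 1.1]   # hypothetical per-resource weights
k = 3
ground = range(len(w))
subsets = [set(c) for r in range(len(w) + 1)
           for c in itertools.combinations(ground, r)]

# Triangle inequality of norms: ||1_{R u S}||_k - ||1_R||_k <= ||1_S||_k
for R in subsets:
    for S in subsets:
        lhs = weighted_lk_norm(R | S, w, k) - weighted_lk_norm(R, w, k)
        assert lhs <= weighted_lk_norm(S, w, k) + 1e-9
print("triangle-inequality step verified on all subset pairs")
```

The check covers only the last inequality of the chain; the exhaustive enumeration over all $2^5\times 2^5$ subset pairs makes the verification deterministic.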

#### Beyond convex functions.

Consider the following natural cost functions, which represent more practical costs when serving clients, as mentioned in the introduction (the cost initially increases fast, then becomes more stable before growing quickly again). Let  be a non-convex function defined as  if  or  and  if , where  are some constants. The cost function is defined as  where  for every . This function is -locally smooth. Again, it is sufficient to verify Inequality (3); the proof is similar to the one in Proposition 1 (more specifically, Lemma 6), noting that the derivative of  for  equals 0.

###### Proposition 3

The algorithm is -competitive for minimizing the non-convex objective function defined above under covering constraints.

#### Submodular functions.

Consider the class of submodular functions $f$, that is, functions satisfying $f(1_{S\cup\{e\}})-f(1_S) \ge f(1_{T\cup\{e\}})-f(1_T)$ for every $S\subseteq T$ and every $e\notin T$. Submodular optimization has been extensively studied in optimization and machine learning. In the context of online algorithms, Buchbinder et al. [17] have considered submodular optimization with preemption, where one can reject previously accepted elements, and have given constant-competitive algorithms for unconstrained and knapsack-constrained problems. To the best of our knowledge, the problem of online submodular minimization under covering constraints has not been considered.

An important concept in studying submodular functions is the curvature. Given a submodular function , the total curvature of is defined as [21]

$$\kappa_f = 1 - \min_e \frac{f(1_E)-f(1_{E\setminus\{e\}})}{f(1_{\{e\}})}$$

Intuitively, the total curvature measures how far  is from being modular. The concept of curvature has been used to determine both upper and lower bounds on the approximation ratios for many submodular optimization and learning problems [21, 27, 9, 44, 34, 43].

In the following, we present a competitive algorithm for minimizing a monotone submodular function under covering constraints, where the competitive ratio is characterized by the curvature of the function (and also by the sparsity of the covering matrix). We first look at a useful property of the total curvature.

###### Lemma 4

For any set , it always holds that

$$f(1_S) \ge (1-\kappa_f)\sum_{e\in S} f(1_{\{e\}}).$$

Proof  Let $S=\{e_1,\ldots,e_m\}$ be an arbitrary subset of $E$. Let $S_i=\{e_1,\ldots,e_i\}$ for $1\le i\le m$ and $S_0=\emptyset$. We have

$$f(1_S) \ge f(1_E)-f(1_{E\setminus S}) = \sum_{i=0}^{m-1}\bigl[f(1_{E\setminus S_i})-f(1_{E\setminus S_{i+1}})\bigr] \ge \sum_{i=1}^{m}\bigl[f(1_E)-f(1_{E\setminus\{e_i\}})\bigr] \ge (1-\kappa_f)\sum_{i=1}^{m} f(1_{\{e_i\}})$$

where the first two inequalities are due to the submodularity of  and the last inequality follows from the definition of the curvature.
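To illustrate Lemma 4 on a concrete instance, the snippet below (illustrative; the coverage function is a hypothetical example, not taken from the paper) computes the total curvature $\kappa_f$ of a small monotone submodular function and checks the lemma's inequality on every subset:

```python
import itertools

def total_curvature(f, E):
    """Total curvature: kappa_f = 1 - min_e [f(E) - f(E minus {e})] / f({e})."""
    fE = f(frozenset(E))
    return 1.0 - min((fE - f(frozenset(E) - {e})) / f(frozenset({e})) for e in E)

# Hypothetical monotone submodular coverage function.
cover = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"d"}}
f = lambda S: len(set().union(*(cover[e] for e in S))) if S else 0
E = [0, 1, 2]

kappa = total_curvature(f, E)
print(kappa)  # -> 0.5: elements 0 and 1 each lose half their standalone value in E

# Lemma 4: f(1_S) >= (1 - kappa_f) * sum_{e in S} f(1_{e}) for every S
for r in range(1, len(E) + 1):
    for S in itertools.combinations(E, r):
        assert f(frozenset(S)) >= (1 - kappa) * sum(f(frozenset({e})) for e in S) - 1e-9
print("Lemma 4 verified on all subsets")
```

For a modular (additive) function the curvature is $0$ and the lemma holds with equality, matching the intuition that $\kappa_f$ measures the distance from modularity.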

###### Proposition 4

The algorithm is -competitive for minimizing monotone submodular function under covering constraints.

Proof  It is sufficient to verify that is -locally smooth. Indeed, the -local smoothness holds due to the submodularity and Lemma 4: for any subset ,

$$\sum_{e\in S}\bigl[f(1_{R\cup\{e\}})-f(1_R)\bigr] \le \sum_{e\in S} f(1_{\{e\}}) \le \frac{1}{1-\kappa_f}\cdot f(1_S)$$

Therefore, the proposition follows.

## 4 Conclusion

In this paper, we have presented primal-dual approaches based on configuration linear programs to design competitive algorithms for problems with non-linear/non-convex objectives. Non-convexity has so far been considered a barrier in optimization. We hope that our approach contributes some elements toward the study of non-linear/non-convex problems. Our work gives rise to several concrete questions related to online optimization under covering constraints. Local smoothness has provided tight bounds for the classes of polynomials with non-negative coefficients and weighted sums of -norms. A natural question is to determine tight bounds for other classes of functions. Moreover, is there a connection between local smoothness and the total curvature in submodular optimization? Intuitively, both concepts measure how far a function is from being modular.

#### Acknowledgement.

We thank Nikhil Bansal and Seeun William Umboh for useful discussions that improve the presentation of the paper.

## Appendix A Applications of Theorem 1

### a.1 Minimum Power Survival Network Routing

#### Problem.

In the problem, we are given a graph and requests arrive online. The demand of a request is specified by a source , a sink , the load vector for every edge (link) and an integer number . At the arrival of request , one needs to choose edge-disjoint paths connecting to . Request increases the load for each edge used to satisfy its demand. The load of an edge is defined as the total load incurred by the requests using . The power cost of edge with load is . The objective is to minimize the total power . Typically where and are parameters depending on .

This problem generalizes the Minimum Power Routing problem (a variant in which  and ) and the Load Balancing problem (a variant in which , all the sources (sinks) are the same () and every path has length 2). For the Minimum Power Routing problem in the offline setting, Andrews et al. [4] gave a polynomial-time poly-logarithmic approximation algorithm. The result has been improved by Makarychev and Sviridenko [37], who gave an -approximation algorithm. In the online setting, Gupta et al. [29] presented an -competitive online algorithm. For the Load Balancing problem, the currently best-known approximation is due to [37] via their rounding technique based on a decoupling inequality. In the online setting, it has been shown that the optimal competitive ratio for the Load Balancing problem is [18].

#### Contribution.

In the problem, the strategy set of each request consists of solutions formed by  edge-disjoint paths connecting  and . Applying the general framework, we deduce the following greedy algorithm.

Let  be the load of edge . Initially, set  for every edge . At the arrival of request , compute a strategy consisting of  edge-disjoint paths connecting  and  such that the increase of the total cost is minimum. Select this strategy for request  and update .

We notice that the strategy for request  can be computed efficiently. Given the current loads on every edge , create a graph consisting of the same vertices and edges as graph . For each edge in this graph, define the capacity to be 1 and the cost to be . Then computing  edge-disjoint paths from  to  with minimal marginal cost in  is equivalent to solving a transportation problem in graph .
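The reduction can be sketched in code. The following Python snippet is a minimal illustration, not the paper's implementation: it assumes a simple directed graph with unit capacities and no anti-parallel edge pairs, and finds edge-disjoint s-t paths of minimum total cost by successive shortest augmenting paths (a standard way to solve this unit-capacity min-cost flow / transportation problem), using Bellman-Ford to handle the negative costs on residual arcs:

```python
from collections import defaultdict

def min_cost_disjoint_paths(n, edges, s, t, k):
    """Minimum total cost of k edge-disjoint s-t paths (unit capacities),
    via successive shortest augmenting paths. edges is a list of (u, v, cost)
    on vertices 0..n-1. Returns the total cost, or None if fewer than k
    edge-disjoint paths exist. Assumes no parallel or anti-parallel edges."""
    cap = defaultdict(lambda: defaultdict(int))
    cost = defaultdict(dict)
    for u, v, c in edges:
        cap[u][v] = 1
        cost[u][v] = c
        cost[v][u] = -c  # residual (reverse) arc cancels the forward cost
    total = 0.0
    for _ in range(k):
        # Bellman-Ford shortest path in the residual graph (handles negatives)
        dist = {v: float("inf") for v in range(n)}
        parent = {}
        dist[s] = 0.0
        for _ in range(n - 1):
            for u in list(cap):
                for v in list(cap[u]):
                    if cap[u][v] > 0 and dist[u] + cost[u][v] < dist[v]:
                        dist[v] = dist[u] + cost[u][v]
                        parent[v] = u
        if dist[t] == float("inf"):
            return None
        total += dist[t]
        v = t
        while v != s:  # push one unit of flow along the augmenting path
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
    return total

# Hypothetical instance: two cheap disjoint paths 0-1-3 and 0-2-3, plus a
# direct expensive edge 0-3.
edges = [(0, 1, 1), (1, 3, 1), (0, 2, 2), (2, 3, 2), (0, 3, 10)]
print(min_cost_disjoint_paths(4, edges, 0, 3, 2))  # -> 6.0
```

In the algorithm above, the cost of each edge would be its marginal power-cost increase under the current loads, so each augmentation selects the cheapest remaining disjoint path with respect to those marginal costs.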

###### Proposition 5

If the congestion costs of all edges are -smooth then the algorithm is -competitive. In particular, if then the algorithm is -competitive where .

Proof  The proposition follows directly from Theorem 1; the particular case follows additionally from Lemma 6.

Note that one can generalize the problem to capture more general or different types of connectivity demands, or settings where congestion costs are incurred at vertices instead of edges. The same results hold.

### a.2 Online Vector Scheduling

#### Problem.

In the problem, there are unrelated machines and jobs arrive online. The load of a job in machine is specified by a vector where and , a fixed parameter, is the dimension of the vector. At the arrival of a job