 # Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Optimization

We study online convex optimization in a setting where the learner seeks to minimize the sum of a per-round hitting cost and a movement cost which is incurred when changing decisions between rounds. We prove a new lower bound on the competitive ratio of any online algorithm in the setting where the costs are $m$-strongly convex and the movement costs are the squared $\ell_2$ norm. This lower bound shows that no algorithm can achieve a competitive ratio that is $o(m^{-1/2})$ as $m$ tends to zero. No existing algorithms have competitive ratios matching this bound, and we show that the state-of-the-art algorithm, Online Balanced Descent (OBD), has a competitive ratio that is $\Omega(m^{-2/3})$. We additionally propose two new algorithms, Greedy OBD (G-OBD) and Regularized OBD (R-OBD), and prove that both algorithms have an $O(m^{-1/2})$ competitive ratio. The result for G-OBD holds when the hitting costs are quasiconvex and the movement costs are the squared $\ell_2$ norm, while the result for R-OBD holds when the hitting costs are $m$-strongly convex and the movement costs are Bregman divergences. Further, we show that R-OBD simultaneously achieves a constant, dimension-free competitive ratio and sublinear regret when hitting costs are strongly convex.


## 1 Introduction

We consider the problem of Smoothed Online Convex Optimization (SOCO), a variant of online convex optimization (OCO) where the online learner pays a movement cost for changing actions between rounds. More precisely, we consider a game where an online learner plays a series of rounds against an adaptive adversary. In each round $t$, the adversary picks a convex cost function $f_t$ and shows it to the learner. After observing the cost function, the learner chooses an action $x_t$ and pays a hitting cost $f_t(x_t)$, as well as a movement cost $c(x_t, x_{t-1})$, which penalizes the online learner for switching points between rounds.

SOCO was originally proposed in the context of dynamic power management in data centers lin2012online . Since then it has seen a wealth of applications, from speech animation to management of electric vehicle charging kim2015decision ; joseph2012jointly ; kim2014real , and more recently applications in control goel2017thinking ; goel2018smoothed and power systems li2018using ; badieionline . SOCO has been widely studied in the machine learning community, with the special cases of online logistic regression and smoothed online maximum likelihood estimation receiving recent attention goel2018smoothed .

Additionally, SOCO has connections to a number of other important problems in online algorithms and learning. Convex Body Chasing (CBC), introduced in friedman1993convex , is a special case of SOCO bubeck2018competitively . The problem of designing competitive algorithms for Convex Body Chasing has attracted much recent attention, e.g., bubeck2018competitively ; bansa2018nested ; argue2019nearly . SOCO can also be viewed as a continuous version of the Metrical Task System (MTS) problem (see borodin1992optimal ; bartal1997polylog ; blum2000line ). A special case of MTS is the celebrated $k$-server problem, first proposed in manasse1990competitive , which has received significant attention in recent years (see bubeck2018k ; buchbinder2019k ).

Given these connections, the design and analysis of algorithms for SOCO and related problems has received considerable attention in the last decade. SOCO was first studied in the scalar setting in lin2013dynamic , which used SOCO to model dynamic “right-sizing” in data centers and gave a 3-competitive algorithm. A 2-competitive algorithm was shown in bansal20152 , also in the scalar setting, which matches the lower bound for online algorithms in this setting antoniadis2017tight . Another rich line of work studies how to design competitive algorithms for SOCO when the online algorithm has access to predictions of future cost functions (see lin2012online ; li2018using ; chen2015online ; chen2016using ).

Despite a large and growing literature on SOCO and related problems, for nearly a decade the only known constant-competitive algorithms that did not use predictions of future costs were for one-dimensional action spaces. In fact, the connections between SOCO and Convex Body Chasing highlight that, in general, one cannot expect dimension-free constant competitive algorithms due to a lower bound (see friedman1993convex ; chen2018smoothed ). However, recently there has been considerable progress moving beyond the one-dimensional setting for large, important classes of hitting and movement costs.

A breakthrough came in 2017 when chen2018smoothed proposed a new algorithm, Online Balanced Descent (OBD), and showed that it is constant competitive in all dimensions in the setting where the hitting costs are locally polyhedral and movement costs are the $\ell_2$ norm. The following year, goel2018smoothed showed that OBD is also constant competitive, specifically $3 + O(1/m)$-competitive, in the setting where the hitting costs are $m$-strongly convex and the movement costs are the squared $\ell_2$ norm. Note that this setting is of particular interest because of its importance for online regression and LQR control (see goel2018smoothed ).

While OBD has proven to be a promising new algorithm, at this point it is not known whether OBD is optimal for the competitive ratio, or if there is more room for improvement. This is because there are no non-trivial lower bounds known for important classes of hitting costs, the most prominent of which is the class of strongly convex functions.

Contributions of this paper. In this paper we prove the first non-trivial lower bounds on SOCO with strongly convex hitting costs, both for general algorithms and for OBD specifically. These lower bounds show that OBD is not optimal and there is an order-of-magnitude gap between its performance and the general lower bound. Motivated by this gap and the construction of the lower bounds we present two new algorithms, both variations of OBD, which have competitive ratios that match the lower bound. More specifically, we make four main contributions in this paper.

First, we prove a new lower bound on the performance achievable by any online algorithm in the setting where the hitting costs are $m$-strongly convex and the movement costs are the squared $\ell_2$ norm. In particular, in Theorem 1, we show that as $m$ tends to zero, any online algorithm must have a competitive ratio of $\Omega(m^{-1/2})$.

Second, we show that the state-of-the-art algorithm, OBD, cannot match this lower bound. More precisely, in Theorem 2 we show that, as $m$ tends to zero, the competitive ratio of OBD is $\Omega(m^{-2/3})$, an order-of-magnitude higher than the lower bound of $\Omega(m^{-1/2})$. This immediately raises the question: can any online algorithm close the gap and match the lower bound?

Our third contribution answers this question in the affirmative. In Section 4, we propose two novel algorithms, Greedy Online Balanced Descent (G-OBD) and Regularized Online Balanced Descent (R-OBD), which are able to close the gap left open by OBD and match the lower bound. Both algorithms can be viewed as "aggressive" variants of OBD, in the sense that they chase the minimizers of the hitting costs more aggressively than OBD. In Theorem 3 we show that G-OBD matches the lower bound up to constant factors for quasiconvex hitting costs (a more general class than $m$-strongly convex). In Theorem 4 we show that R-OBD has a competitive ratio that precisely matches the lower bound, including the constant factors, and hence can be viewed as an optimal algorithm for SOCO in the setting where the costs are $m$-strongly convex and the movement cost is the squared $\ell_2$ norm. Further, our results for R-OBD hold not only for squared $\ell_2$ movement costs; they also hold for movement costs that are Bregman divergences, which commonly appear throughout information geometry, probability, and optimization.

Finally, in our last section we move beyond competitive ratio and additionally consider regret. We prove in Theorem 6 that R-OBD can simultaneously achieve a bounded, dimension-free competitive ratio and sublinear regret in the case of $m$-strongly convex hitting costs and squared $\ell_2$ movement costs. This result helps close a crucial gap in the literature. Previous work has shown that it is not possible for any algorithm to simultaneously achieve both a constant competitive ratio and sublinear regret in general SOCO problems daniely2019competitive . However, this was shown through the use of linear hitting and movement costs. Thus, the question of whether it is possible to simultaneously achieve a dimension-free, constant competitive ratio and sublinear regret when hitting costs are strongly convex has remained open. The closest previous result is from chen2018smoothed , which showed that OBD can achieve either a constant competitive ratio or sublinear regret with locally polyhedral cost functions, depending on the "balance condition" used; however, both cannot be achieved simultaneously. Our result (Theorem 6) shows that R-OBD can simultaneously provide a constant competitive ratio and sublinear regret for strongly convex cost functions when the movement costs are the squared $\ell_2$ norm.

## 2 Model & Preliminaries

An instance of Smoothed Online Convex Optimization (SOCO) consists of a convex action set $\mathcal{X} \subseteq \mathbb{R}^d$, an initial point $x_0 \in \mathcal{X}$, a sequence of non-negative convex cost functions $f_1, \ldots, f_T$, and a movement cost $c$. In every round $t$, the environment picks a cost function $f_t$ (potentially adversarially) for an online learner. After observing the cost function, the learner chooses an action $x_t \in \mathcal{X}$ and pays a cost that is the sum of the hitting cost, $f_t(x_t)$, and the movement cost, a.k.a. switching cost, $c(x_t, x_{t-1})$. The goal of the online learner is to minimize its total cost over $T$ rounds:

$$\mathrm{cost}(ALG) = \sum_{t=1}^{T} \big( f_t(x_t) + c(x_t, x_{t-1}) \big).$$

We emphasize that it is the movement costs that make this problem interesting and challenging; if there were no movement costs, i.e., $c \equiv 0$, the problem would be trivial, since the learner could always pay the optimal cost simply by picking the action that minimizes the hitting cost in each round, i.e., by setting $x_t = \arg\min_x f_t(x)$. The movement cost couples the cost the learner pays across rounds, which means that the optimal action of the learner depends on unknown future costs.
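The total cost described above can be sketched in a few lines of Python. This is only an illustration: the names `total_cost` and `squared_l2_movement` are hypothetical, and the factor of 1/2 in the movement cost is an assumed normalization.

```python
from typing import Callable, Sequence

Point = Sequence[float]


def squared_l2_movement(x: Point, y: Point) -> float:
    """Assumed movement cost c(x, y) = (1/2) * ||x - y||_2^2."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, y))


def total_cost(
    actions: Sequence[Point],
    hitting_costs: Sequence[Callable[[Point], float]],
    movement_cost: Callable[[Point, Point], float],
    x0: Point,
) -> float:
    """Sum of hitting cost f_t(x_t) plus movement cost c(x_t, x_{t-1}) over all rounds."""
    cost, prev = 0.0, x0
    for x, f in zip(actions, hitting_costs):
        cost += f(x) + movement_cost(x, prev)
        prev = x
    return cost


# Two rounds with the same hitting cost, minimized at the point (1,).
f = lambda x: (x[0] - 1.0) ** 2
cost = total_cost([[1.0], [1.0]], [f, f], squared_l2_movement, [0.0])
```

In this toy instance the learner pays only for the initial move from 0 to 1; every later round is free because it stays at the minimizer.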

There is a long literature on SOCO, both focusing on algorithmic questions, e.g., goel2018smoothed ; lin2013dynamic ; bansal20152 ; chen2018smoothed , and applications, e.g., kim2015decision ; joseph2012jointly ; kim2014real ; lin2012online . The variety of applications studied means that a variety of assumptions about the movement costs have been considered. Motivated by applications to data center capacity management, movement costs have often been taken as the $\ell_1$ norm, i.e., $c(x_t, x_{t-1}) = \|x_t - x_{t-1}\|_1$, e.g., lin2013dynamic ; bansal20152 . However, recently, more general norms have been considered and the setting of squared $\ell_2$ movement costs has gained attention due to its use in online regression problems and connections to LQR control, among other applications (see goel2017thinking ; goel2018smoothed ; astrom2010feedback ).

In this paper, we focus on the setting of the squared $\ell_2$ norm, i.e., $c(x_t, x_{t-1}) = \frac{1}{2}\|x_t - x_{t-1}\|_2^2$; however, we also consider a generalization of the norm in Section 4.2 where $c$ is the Bregman divergence. Specifically, we consider $c(x_t, x_{t-1}) = D_h(x_t \| x_{t-1}) = h(x_t) - h(x_{t-1}) - \langle \nabla h(x_{t-1}), x_t - x_{t-1} \rangle$, where both the potential $h$ and its Fenchel conjugate $h^*$ are differentiable. Further, we assume that $h$ is $\alpha$-strongly convex and $\beta$-strongly smooth with respect to an underlying norm $\|\cdot\|$. Definitions of each of these properties can be found in the appendix.

Note that the squared $\ell_2$ norm is itself a Bregman divergence, with $h(x) = \frac{1}{2}\|x\|_2^2$, $\alpha = \beta = 1$, and $\|\cdot\| = \|\cdot\|_2$. However, more generally, when $h(x) = \sum_i x_i \log x_i$ with the probability simplex as its domain, $D_h$ is the Kullback-Leibler divergence (see bansal2017potential ). Further, this potential is strongly convex and strongly smooth on a suitably restricted domain (see chen2018smoothed ). This extension is important given the role Bregman divergence plays across optimization and information theory, e.g., see azizan2018stochastic ; murata2004information .
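The two special cases mentioned above can be checked numerically with a small sketch. The helper names below (`bregman`, `half_sq_norm`, `neg_entropy`, and their gradients) are hypothetical, and the standard definition $D_h(x \| y) = h(x) - h(y) - \langle \nabla h(y), x - y \rangle$ is assumed.

```python
import math


def bregman(h, grad_h, x, y):
    """Bregman divergence D_h(x || y) = h(x) - h(y) - <grad h(y), x - y>."""
    inner = sum(g * (a - b) for g, a, b in zip(grad_h(y), x, y))
    return h(x) - h(y) - inner


def half_sq_norm(x):
    """Potential h(x) = (1/2) * ||x||_2^2."""
    return 0.5 * sum(a * a for a in x)


def grad_half_sq_norm(x):
    return list(x)


def neg_entropy(x):
    """Potential h(x) = sum_i x_i * log(x_i), for strictly positive x."""
    return sum(a * math.log(a) for a in x)


def grad_neg_entropy(x):
    return [math.log(a) + 1.0 for a in x]


# Half squared norm potential recovers (1/2) * ||x - y||_2^2.
d_sq = bregman(half_sq_norm, grad_half_sq_norm, [1.0, 2.0], [3.0, 0.0])

# Negative entropy on the simplex recovers the KL divergence.
p, q = [0.2, 0.8], [0.5, 0.5]
d_kl = bregman(neg_entropy, grad_neg_entropy, p, q)
```

The KL identity relies on `p` and `q` both summing to one; for general positive vectors the negative-entropy divergence carries an extra `sum(q) - sum(p)` term.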

As with movement costs, a variety of assumptions have been made about hitting costs. In particular, because of the emergence of pessimistic lower bounds when general convex hitting costs are considered, papers typically have considered restricted classes of functions, e.g., locally polyhedral chen2018smoothed and strongly convex goel2018smoothed . In this paper, we focus on hitting costs that are $m$-strongly convex; however, our results in Section 4.1 generalize to the case of quasiconvex functions.

Competitive Ratio and Regret. The primary goal of the SOCO literature is to design online algorithms that (nearly) match the performance of the offline optimal algorithm. The performance metric used to evaluate an algorithm is typically the competitive ratio because the goal is to learn in an environment that is changing dynamically and is potentially adversarial. The competitive ratio is the worst-case ratio of the total cost incurred by the online learner to the offline optimal cost. The cost of the offline optimal is defined as the minimal cost an algorithm can incur if it has full knowledge of the sequence of costs $f_1, \ldots, f_T$, i.e.,

$$\mathrm{cost}(OPT) = \min_{x_1, \ldots, x_T} \sum_{t=1}^{T} \big( f_t(x_t) + c(x_t, x_{t-1}) \big).$$

Using this, the competitive ratio is defined as

$$\sup_{f_1, \ldots, f_T} \frac{\mathrm{cost}(ALG)}{\mathrm{cost}(OPT)}.$$

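Note that another important performance measure of interest is the regret. In this paper, we study a generalization of the classical regret called the $L$-constrained regret, which is defined as follows. The $L$-(constrained) dynamic regret of an online algorithm $ALG$ is $\rho_L(T)$ if for all sequences of cost functions $f_1, \ldots, f_T$, we have $\mathrm{cost}(ALG) - \mathrm{cost}(OPT(L)) \le \rho_L(T)$, where $\mathrm{cost}(OPT(L))$ is the cost of an $L$-constrained offline optimal solution, i.e., one with movement cost upper bounded by $L$:

$$OPT(L) = \arg\min_{x_1, \ldots, x_T} \sum_{t=1}^{T} \big( f_t(x_t) + c(x_t, x_{t-1}) \big) \quad \text{subject to} \quad \sum_{t=1}^{T} c(x_t, x_{t-1}) \le L.$$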

As the definitions above highlight, the regret and competitive ratio both compare with the cost of an offline optimal solution; however, regret constrains the movement allowed by the offline optimal. The classical notion of regret focuses on the static optimal ($L = 0$), but relaxing that to allow limited movement bridges regret and the competitive ratio since, as $L$ grows, the $L$-constrained offline optimal approaches the offline (dynamic) optimal. Intuitively, one can think of regret as being suited for evaluating learning algorithms in (nearly) static settings and the competitive ratio as being suited for evaluating learning algorithms in dynamic settings.
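To make the comparison against the offline optimal concrete, the sketch below computes the offline optimal for a simple one-dimensional family and evaluates the cost ratio of a given action sequence on a single instance. It assumes hitting costs $f_t(x) = \frac{m}{2}(x - v_t)^2$ and movement cost $\frac{1}{2}(x_t - x_{t-1})^2$; the function names are hypothetical. The ratio on one instance only lower-bounds the competitive ratio, which is a supremum over all instances.

```python
def total_cost(xs, vs, m, x0=0.0):
    """Cost of actions xs for f_t(x) = (m/2)(x - v_t)^2, c = (1/2)(x_t - x_{t-1})^2."""
    cost, prev = 0.0, x0
    for x, v in zip(xs, vs):
        cost += 0.5 * m * (x - v) ** 2 + 0.5 * (x - prev) ** 2
        prev = x
    return cost


def offline_optimal(vs, m, x0=0.0):
    """Solve the first-order optimality conditions, a tridiagonal linear system.

    Row t reads: -x_{t-1} + d_t * x_t - x_{t+1} = m * v_t, with d_t = m + 2
    for interior rounds and d_t = m + 1 for the last round. Requires len(vs) >= 1.
    """
    T = len(vs)
    diag = [m + 2.0] * T
    diag[-1] = m + 1.0  # the last round has no successor to couple with
    rhs = [m * v for v in vs]
    rhs[0] += x0  # the known starting point moves to the right-hand side
    # Thomas algorithm: forward elimination (all off-diagonal entries are -1).
    for t in range(1, T):
        w = -1.0 / diag[t - 1]
        diag[t] += w
        rhs[t] -= w * rhs[t - 1]
    # Back substitution.
    xs = [0.0] * T
    xs[-1] = rhs[-1] / diag[-1]
    for t in range(T - 2, -1, -1):
        xs[t] = (rhs[t] + xs[t + 1]) / diag[t]
    return xs


def instance_ratio(actions, vs, m, x0=0.0):
    """Cost of `actions` divided by the offline optimal cost on this instance."""
    return total_cost(actions, vs, m, x0) / total_cost(offline_optimal(vs, m, x0), vs, m, x0)


# A learner that jumps straight to each minimizer, on a five-round instance.
vs = [1.0] * 5
ratio = instance_ratio(vs, vs, m=0.1)
```

Because the objective is a strictly convex quadratic in this family, solving the linear system gives the exact offline optimal; other hitting costs would need a general convex solver instead.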

Online Balanced Descent. The state-of-the-art algorithm for SOCO is Online Balanced Descent (OBD). OBD is formally defined in Algorithm 1. Intuitively, it works as follows. In every round, OBD projects the previously chosen point $x_{t-1}$ onto a carefully chosen level set of the current cost function $f_t$. The level set is chosen so that the hitting costs and movement costs are "balanced": in every round, the movement cost is at most a constant times the hitting cost. The balance helps ensure that the online learner is matching the offline costs. Since neither cost is too high, OBD ensures that both are comparable to the offline optimal. The balance parameter can be tuned to give the optimal competitive ratio and the appropriate level set can be efficiently selected via binary search.
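The balancing idea can be sketched for a one-dimensional toy case. This is not a reproduction of Algorithm 1: it assumes the hitting cost $f_t(x) = \frac{m}{2}(x - v)^2$, the movement cost $\frac{1}{2}(x - x_{t-1})^2$, and a balance condition of the form "movement cost equals `beta` times the level", where `beta` and the function name `obd_step` are hypothetical.

```python
def obd_step(x_prev: float, v: float, m: float, beta: float, iters: int = 60) -> float:
    """One balanced step for f(x) = (m/2)(x - v)^2 and c(x, y) = (1/2)(x - y)^2."""

    def hit(x: float) -> float:
        return 0.5 * m * (x - v) ** 2

    def move(x: float) -> float:
        return 0.5 * (x - x_prev) ** 2

    def project(level: float) -> float:
        # The sublevel set {x : f(x) <= level} is the interval [v - r, v + r],
        # so projecting x_prev onto it is a clamp.
        r = (2.0 * level / m) ** 0.5
        return min(max(x_prev, v - r), v + r)

    # Binary search over the level: a higher level means a shorter move,
    # so move(project(level)) - beta * level is decreasing in the level.
    lo, hi = 0.0, hit(x_prev)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if move(project(mid)) > beta * mid:
            lo = mid  # moved too far relative to the hitting cost: raise the level
        else:
            hi = mid
    return project(hi)


x_new = obd_step(x_prev=0.0, v=1.0, m=1.0, beta=1.0)
```

With `m = beta = 1`, the balanced point sits halfway between the previous point and the minimizer, where the two costs are equal.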

Implicitly, OBD can be viewed as a proximal algorithm with a dynamic step size Boyd14proximal , in the sense that, like proximal algorithms, OBD iteratively projects the previously chosen point onto a level set of the cost function. Unlike traditional proximal algorithms, OBD considers several different level sets, and carefully selects the level set in every round so as to balance the hitting and movement costs. We exploit this connection heavily when designing Regularized OBD (R-OBD), which is a proximal algorithm with a special regularization term added to the objective to help steer the online learner towards the hitting cost minimizer in each round.

OBD was proposed in chen2018smoothed , where the authors show that it has a constant, dimension-free competitive ratio in the setting where the movement costs are the $\ell_2$ norm and the hitting costs are locally polyhedral, i.e., grow at least linearly away from the minimizer. This was the first time an algorithm had been shown to be constant competitive beyond one-dimensional action spaces. In the same paper, a variation of OBD that uses a different balance condition was proven to have sublinear $L$-constrained regret for locally polyhedral hitting costs. OBD has since been shown to also have a constant, dimension-free competitive ratio when movement costs are the squared $\ell_2$ norm and hitting costs are strongly convex, which is the setting we consider in this paper. However, up until this paper, lower bounds for the strongly convex setting did not exist and it was not known whether the performance of OBD in this setting is optimal or if OBD can simultaneously achieve sublinear regret and a constant, dimension-free competitive ratio.

## 3 Lower Bounds

Our first set of results focuses on lower bounding the competitive ratio achievable by online algorithms for SOCO. While chen2018smoothed proves a general lower bound for SOCO showing that the competitive ratio of any online algorithm is $\Omega(\sqrt{d})$, where $d$ is the dimension of the action space, there are large classes of important problems where better performance is possible. In particular, when the hitting costs are $m$-strongly convex, goel2018smoothed has shown that OBD provides a dimension-free competitive ratio of $3 + O(1/m)$. However, no non-trivial lower bounds are known for the strongly convex setting.

Our first result in this section shows a general lower bound on the competitive ratio of SOCO algorithms when the hitting costs are strongly convex and the movement costs are quadratic. Importantly, there is a gap between this bound and the competitive ratio for OBD proven in goel2018smoothed . Our second result further explores this gap. We show a lower bound on the competitive ratio of OBD which highlights that OBD cannot achieve a competitive ratio that matches the general lower bound. This gap, and the construction used to show it, motivate us to propose new variations of OBD in the next section. We then prove that these new algorithms have competitive ratios that match the lower bound.

We begin by stating the first lower bound for strongly convex hitting costs in SOCO.

###### Theorem 1.

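Consider hitting cost functions that are $m$-strongly convex with respect to the $\ell_2$ norm and movement costs given by $\frac{1}{2}\|x_t - x_{t-1}\|_2^2$. Any online algorithm must have a competitive ratio at least $\frac{1}{2}\left(1 + \sqrt{1 + \frac{4}{m}}\right)$.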

Theorem 1 is proven in the appendix using an argument that leverages the fact that, when the movement cost is quadratic, reaching a target point via one large step is more costly than reaching it by taking many small steps. More concretely, to prove the lower bound we consider a scenario on the real line where the online algorithm encounters a sequence of cost functions whose minimizers are at zero, followed by a very steep cost function whose minimizer is at a point away from zero. Without knowledge of the future, the algorithm has no incentive to move away from zero until the last step, when it is forced to incur a large cost; however, the offline adversary, with full knowledge of the cost sequence, can divide the journey into multiple small steps.
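The advantage of small steps is a one-line calculation. As a worked illustration (assuming the movement cost $\frac{1}{2}\|\cdot\|_2^2$ and ignoring hitting costs along the way), covering a distance $D$ in one step versus $n$ equal steps costs:

```latex
\underbrace{\tfrac{1}{2} D^2}_{\text{one step}}
\qquad \text{versus} \qquad
\underbrace{\sum_{i=1}^{n} \tfrac{1}{2} \left( \tfrac{D}{n} \right)^2 = \frac{D^2}{2n}}_{n \text{ equal steps}}
```

The movement cost of the split journey shrinks by a factor of $n$. What the offline optimal must trade off against this saving is the hitting cost it pays for leaving the earlier minimizers early.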

Importantly, the lower bound in Theorem 1 highlights the dependence of the competitive ratio on $m$, the convexity parameter. It shows that the case where online algorithms do the worst is when $m$ is small, and that algorithms that match the lower bound up to a constant are those for which the competitive ratio is $O(m^{-1/2})$ as $m \to 0^+$. Note that our results in Section 4 show that there exist online algorithms that precisely achieve the competitive ratio in Theorem 1. However, in contrast, the following shows that OBD cannot match the lower bound in Theorem 1.

###### Theorem 2.

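Consider hitting cost functions that are $m$-strongly convex with respect to the $\ell_2$ norm and movement costs given by $\frac{1}{2}\|x_t - x_{t-1}\|_2^2$. The competitive ratio of OBD is $\Omega(m^{-2/3})$ as $m \to 0^+$, for any fixed value of the balance parameter.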

As we have discussed, OBD is the state-of-the-art algorithm for SOCO, and has been shown to provide a competitive ratio of $3 + O(1/m)$ goel2018smoothed . However, Theorem 2 highlights a gap between OBD and the general lower bound. If the lower bound is achievable (which we prove it is in the next section), this implies that OBD is a sub-optimal algorithm.

The proof of Theorem 2 gives important intuition about what goes wrong with OBD and how the algorithm can be improved. Specifically, our proof of Theorem 2 considers a scenario where the cost functions have minimizers very near each other, but OBD takes a series of steps without approaching the minimizing points. The offline optimal is able to pay little cost and stay near the minimizers, but OBD never moves enough to be close to the minimizers. Figure 1 illustrates the construction, showing OBD moving along the circumference of a circle, while the offline optimal stays near the origin.

Figure 1: Counterexample used to prove Theorem 2. In the figure, $\{x_t\}$ are the choices of OBD and $\{x_t^*\}$ are the choices of the offline optimal.

## 4 Algorithms

The lower bounds in Theorem 1 and Theorem 2 suggest a gap between the competitive ratio of OBD and what is achievable via an online algorithm. Further, the construction used in the proof of Theorem 2 highlights the core issue that leads to inefficiency in OBD. In the construction, OBD takes a large step from $x_{t-1}$ to $x_t$, but its distance to the offline optimal, $x_t^*$, only decreases by a very small amount. This means that OBD is continually chasing the offline optimal but never closing the gap. In this section, we take inspiration from this example and develop two new algorithms that build on OBD but ensure that the gap to the offline optimal shrinks.

How to ensure that the gap to the offline optimal shrinks is not obvious since, without knowledge of the future, it is impossible to determine how $x_t^*$ will evolve. A natural idea is to determine an online estimate of $x_t^*$ and then move towards that estimate. Motivated by the construction in the proof of Theorem 2, we use the minimizer of the hitting cost at round $t$, denoted $v_t$, as a rough estimate of the offline optimal and ensure that we close the gap to $v_t$ in each round.

There are a number of ways of implementing the goal of ensuring that OBD more aggressively moves toward the minimizer of the hitting cost each round. In this section, we consider two concrete approaches, each of which (nearly) matches the lower bound in Theorem 1.

The first approach, which we term Greedy OBD (Algorithm 2), is a two-stage algorithm, where the first stage applies OBD and the second stage explicitly takes a step (of carefully chosen size) directly towards the minimizer. We introduce the algorithm and analyze its performance in Section 4.1. Greedy OBD is order-optimal, i.e., matches the lower bound up to constant factors, in the setting of squared $\ell_2$ norm movement costs and quasiconvex hitting costs.

The second approach for ensuring that OBD moves aggressively toward the minimizer uses a different view of OBD. In particular, Greedy OBD uses a geometric view of OBD, which is the way OBD has been presented previously in the literature. Our second approach uses a "local view" of OBD that parallels the local view of gradient descent and mirror descent, e.g., see bansal2017potential ; hazan2016introduction . In particular, the choice of an action in OBD can be viewed as the solution to a per-round local optimization. Given this view, we ensure that OBD more aggressively tracks the minimizer by adding a regularization term to this local optimization which penalizes points that are far from the minimizer. We term this approach Regularized OBD (Algorithm 3), and study it in Section 4.2. Note that Regularized OBD has a competitive ratio that precisely matches the lower bound, including the constant factors, when movement costs are Bregman divergences and hitting costs are $m$-strongly convex. Thus, it applies to more general movement costs than Greedy OBD but less general hitting costs.

### 4.1 Greedy OBD

The formal description of Greedy Online Balanced Descent (G-OBD) is given in Algorithm 2. G-OBD has two steps each round. First, the algorithm takes a standard OBD step from the previous point $x_{t-1}$ to a new point $x_t'$, which is the projection of $x_{t-1}$ onto a level set of the current hitting cost $f_t$, where the level set is chosen to balance hitting and movement costs. G-OBD then takes an additional step directly towards the minimizer of the hitting cost, $v_t$, with the size of the step chosen based on the convexity parameter $m$. G-OBD can be implemented efficiently using the same approach as described for OBD chen2018smoothed . G-OBD has two parameters, $\gamma$ and $\mu$. The first, $\gamma$, is the balance parameter in OBD and the second, $\mu$, is a parameter controlling the size of the step towards the minimizer $v_t$. Note that the two-step approach of G-OBD is reminiscent of the two-stage algorithm used in bienkowski2018better ; however, the resulting algorithms are quite distinct.
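The two-stage structure can be sketched as follows. This is an illustration of the shape of the update only, not Algorithm 2: the exact rule that sets the second step's size from the convexity and step-size parameters is not reproduced here, so the sketch takes a hypothetical fraction `theta` in [0, 1] as an input, and `obd_step` is any callable implementing the balanced first stage.

```python
from typing import Callable


def g_obd_step(
    x_prev: float,
    v: float,
    obd_step: Callable[[float], float],
    theta: float,
) -> float:
    """Two-stage update: a balanced OBD step, then a move toward the minimizer v.

    theta is a hypothetical step fraction: 0 reproduces plain OBD,
    1 jumps all the way to the minimizer.
    """
    theta = min(max(theta, 0.0), 1.0)
    x_mid = obd_step(x_prev)  # stage 1: standard OBD step
    return x_mid + theta * (v - x_mid)  # stage 2: greedy step toward v


# Stand-in first stage (an assumption for the demo): move halfway to the minimizer 1.0.
halfway = lambda x: 0.5 * (x + 1.0)
x_plain = g_obd_step(0.0, 1.0, halfway, theta=0.0)
x_greedy = g_obd_step(0.0, 1.0, halfway, theta=0.5)
```

Passing the first stage in as a callable keeps the sketch independent of how the level-set projection is implemented.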

While the addition of a second step in G-OBD may seem like a small change, it improves performance by an order-of-magnitude. We prove that G-OBD asymptotically matches the general lower bound proven in Theorem 1 not just for $m$-strongly convex hitting costs, but more broadly for quasiconvex costs.

###### Theorem 3.

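Consider quasiconvex hitting costs such that $f_t(x) \ge f_t(v_t) + \frac{m}{2}\|x - v_t\|_2^2$, where $v_t$ is the minimizer of $f_t$, and movement costs $c(x_t, x_{t-1}) = \frac{1}{2}\|x_t - x_{t-1}\|_2^2$. G-OBD with suitably chosen parameters $\gamma$ and $\mu$ is an $O(m^{-1/2})$-competitive algorithm as $m \to 0^+$.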

### 4.2 Regularized OBD

The G-OBD framework is based on the geometric view of OBD used previously in the literature. There are, however, two limitations to this approach. First, the competitive ratio obtained, while having optimal asymptotic dependence on $m$, does not match the constants in the lower bound of Theorem 1. Second, G-OBD requires repeated projections, which makes efficient implementation challenging when the functions have complex geometry.

Here, we present a variation of OBD based on a local view that overcomes these limitations. Regularized OBD (R-OBD) is computationally simpler and provides a competitive ratio that matches the constant factors in the lower bound in Theorem 1. However, unlike G-OBD, our analysis of R-OBD does not apply to quasiconvex hitting costs. R-OBD is described formally in Algorithm 3. In each round, R-OBD picks a point that minimizes a weighted sum of the hitting and movement costs, as well as a regularization term which encourages the algorithm to pick points close to the minimizer of the current hitting cost function, $v_t$. Thus, R-OBD can be implemented efficiently using two invocations of a convex solver. Note that R-OBD has two parameters, $\lambda_1$ and $\lambda_2$, which adjust the weights of the movement cost and regularizer respectively.

While it may not be immediately clear how R-OBD connects to OBD, it is straightforward to illustrate the connection in the squared $\ell_2$ setting. In this case, computing $\arg\min_x f_t(x) + \frac{\lambda_1}{2}\|x - x_{t-1}\|_2^2$ is equivalent to doing a projection onto a level set of $f_t$, since the selection of the minimizer can be restated as the solution to $\min_x \|x - x_{t-1}\|_2^2$ subject to $f_t(x) \le l$ for an appropriate level $l$. Thus, without the regularizer, the optimization in R-OBD gives a local view of OBD and then the regularizer provides more aggressive movement toward the minimizer of the hitting cost.
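For intuition, the per-round optimization has a closed form in a one-dimensional quadratic toy case. The sketch below assumes the hitting cost $f_t(x) = \frac{m}{2}(x - v)^2$, squared-distance movement cost and regularizer, and weights named `lam1` and `lam2`; the function names are hypothetical.

```python
def r_obd_objective(x, x_prev, v, m, lam1, lam2):
    """Hitting cost + lam1 * movement cost + lam2 * distance-to-minimizer regularizer."""
    return (
        0.5 * m * (x - v) ** 2
        + lam1 * 0.5 * (x - x_prev) ** 2
        + lam2 * 0.5 * (x - v) ** 2
    )


def r_obd_step(x_prev, v, m, lam1, lam2):
    """Minimizer of r_obd_objective, obtained by setting its derivative to zero."""
    return ((m + lam2) * v + lam1 * x_prev) / (m + lam1 + lam2)


# Without the regularizer the step lands halfway; adding it pulls the step toward v.
x_no_reg = r_obd_step(x_prev=0.0, v=1.0, m=1.0, lam1=1.0, lam2=0.0)
x_reg = r_obd_step(x_prev=0.0, v=1.0, m=1.0, lam1=1.0, lam2=2.0)
```

The update is a weighted average of the previous point and the minimizer: a larger `lam1` makes the learner more reluctant to move, while a larger `lam2` makes it chase the minimizer harder.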

Not only does the local view lead to a computationally simpler algorithm, but we prove that R-OBD matches the constant factors in Theorem 1 precisely, not just asymptotically. Further, it does this not just in the setting where movement costs are the squared $\ell_2$ norm, but also in the case where movement costs are Bregman divergences.

###### Theorem 4.

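Consider hitting costs that are $m$-strongly convex with respect to a norm $\|\cdot\|$ and movement costs defined as $c(x_t, x_{t-1}) = D_h(x_t \| x_{t-1})$, where $h$ is $\alpha$-strongly convex and $\beta$-strongly smooth with respect to the same norm. Additionally, assume the hitting costs $f_t$, the potential $h$, and its Fenchel conjugate $h^*$ are differentiable. Then, R-OBD with parameters $\lambda_1 > 0$ and $\lambda_2 \ge 0$ has a competitive ratio of

$$\max\left( \frac{m + \lambda_2}{m \lambda_1},\; 1 + \frac{\beta^2}{\alpha} \cdot \frac{\lambda_1}{\lambda_2 + m} \right).$$

If $\lambda_1$ and $\lambda_2$ satisfy $m + \lambda_2 = \frac{m}{2}\left(1 + \sqrt{1 + \frac{4\beta^2}{\alpha m}}\right)\lambda_1$, then the competitive ratio is

$$\frac{1}{2}\left(1 + \sqrt{1 + \frac{4\beta^2}{\alpha m}}\right).$$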

Theorem 4 focuses on movement costs that are Bregman divergences, which generalizes the case of squared $\ell_2$ movement costs. To recover the squared $\ell_2$ case, we use the $\ell_2$ norm and $\alpha = \beta = 1$, which results in a competitive ratio of $\frac{1}{2}\left(1 + \sqrt{1 + \frac{4}{m}}\right)$. This competitive ratio matches exactly with the lower bound claimed in Theorem 1. Further, in this case the assumption in Theorem 4 that the hitting cost functions are differentiable is not required (see Theorem 7 in the appendix).

It is also interesting to investigate the settings of $\lambda_1$ and $\lambda_2$ that yield the optimal competitive ratio. Surprisingly, setting the regularizer weight $\lambda_2 = 0$ achieves the optimal competitive ratio as long as the movement weight $\lambda_1$ is chosen appropriately. By restating the update rule in R-OBD as $\nabla h(x_t) = \nabla h(x_{t-1}) - \frac{1}{\lambda_1}\nabla f_t(x_t)$, we see that R-OBD with $\lambda_2 = 0$ can be interpreted as "one step lookahead mirror descent". Further, R-OBD with $\lambda_2 = 0$ can be implemented even when we do not know the location of the minimizer $v_t$. For example, when $h(x) = \frac{1}{2}\|x\|_2^2$, we can run gradient descent starting at $x_{t-1}$ to minimize the strongly convex function $f_t(x) + \frac{\lambda_1}{2}\|x - x_{t-1}\|_2^2$. Only local gradients will be queried in this process. However, the following lower bound highlights that this simple form comes at some cost in terms of generality when compared with our results for G-OBD.
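The gradient-descent implementation mentioned above can be sketched as follows. It assumes the squared-distance movement cost and no regularizer, so the per-round objective is the hitting cost plus `lam1 / 2` times the squared distance to the previous point; the function name, the fixed step size, and the iteration count are hypothetical choices for the demo.

```python
from typing import Callable


def r_obd_step_gd(
    grad_f: Callable[[float], float],
    x_prev: float,
    lam1: float,
    step: float,
    iters: int = 2000,
) -> float:
    """Minimize f(x) + (lam1 / 2) * (x - x_prev)^2 by gradient descent from x_prev.

    Only gradients of f are queried; the minimizer of f is never needed.
    """
    x = x_prev
    for _ in range(iters):
        x -= step * (grad_f(x) + lam1 * (x - x_prev))
    return x


# Demo hitting cost f(x) = (m / 2) * (x - v)^2 with m = 1, v = 1 (gradient only).
grad = lambda x: 1.0 * (x - 1.0)
x_gd = r_obd_step_gd(grad, x_prev=0.0, lam1=1.0, step=0.1)
```

For this quadratic demo the fixed step size 0.1 is below the stability threshold `2 / (m + lam1)`, so the iterates contract geometrically to the closed-form minimizer `(m * v + lam1 * x_prev) / (m + lam1)`.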

###### Theorem 5.

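Consider quasiconvex hitting costs such that $f_t(x) \ge f_t(v_t) + \frac{m}{2}\|x - v_t\|_2^2$, where $v_t$ is the minimizer of $f_t$, and movement costs given by $c(x_t, x_{t-1}) = \frac{1}{2}\|x_t - x_{t-1}\|_2^2$. Regularized OBD has a competitive ratio of $\Omega(1/m)$ when $\lambda_2 = 0$.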

## 5 Balancing Regret and Competitive Ratio

In the previous sections we have focused on the competitive ratio; however, another important performance measure is regret. In this section, we consider the $L$-constrained dynamic regret. The motivation for our study is daniely2019competitive , which provides an impossibility result showing that no algorithm can simultaneously maintain a constant competitive ratio and a sublinear regret in the general setting of SOCO. However, daniely2019competitive utilizes linear hitting costs in its construction and thus it is an open question as to whether this impossibility result holds for strongly convex hitting costs. In this section, we show that the impossibility result does not hold for strongly convex hitting costs. To show this, we first characterize the parameters for which R-OBD gives sublinear regret.

###### Theorem 6.

Consider hitting costs that are strongly convex with respect to a norm and movement costs defined as , where is -strongly convex and -strongly smooth with respect to the same norm. Additionally, assume and its Fenchel Conjugate are differentiable. Further, suppose that is bounded above by , the diameter of the feasible set is bounded above by , and . Then, for such that and , where is such that , the -constrained regret of R-OBD is .

Theorem 6 highlights that regret can be achieved when and for some constant . This suggests that the tendency to aggressively move towards the minimizer should shrink over time in order to achieve a small regret. It is not possible to use Theorem 6 to simultaneously achieve the optimal competitive ratio and regret for all strongly convex hitting costs (). However, the corollary below shows that it is possible to simultaneously achieve a dimension-free, constant competitive ratio and an regret for all . An interesting open question that remains is whether it is possible to develop an algorithm that has sublinear regret and matches the optimal order for competitive ratio.

###### Corollary 1.

Consider the same conditions as in Theorem 6 and fix . R-OBD with parameters has an regret and is -competitive.

## References

•  A. Antoniadis and K. Schewior. A tight lower bound for online convex optimization with switching costs. In Proceedings of the International Workshop on Approximation and Online Algorithms, pages 164–175. Springer, 2017.
•  C. Argue, S. Bubeck, M. B. Cohen, A. Gupta, and Y. T. Lee. A nearly-linear bound for chasing nested convex bodies. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 117–122, 2019.
•  K. J. Åström and R. M. Murray. Feedback systems: an introduction for scientists and engineers. Princeton University Press, 2010.
•  N. Azizan and B. Hassibi. Stochastic gradient/mirror descent: Minimax optimality and implicit regularization. In Proceedings of the International Conference on Learning Representations (ICLR), 2019.
•  M. Badiei, N. Li, and A. Wierman. Online convex optimization with ramp constraints. In IEEE Conference on Decision and Control (CDC), pages 6730–6736, 2015.
•  N. Bansal, M. Böhm, M. Eliáš, G. Koumoutsos, and S. W. Umboh. Nested convex bodies are chaseable. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1253–1260, 2018.
•  N. Bansal and A. Gupta. Potential-function proofs for first-order methods. arXiv preprint arXiv:1712.04581, 2017.
•  N. Bansal, A. Gupta, R. Krishnaswamy, K. Pruhs, K. Schewior, and C. Stein. A 2-competitive algorithm for online convex optimization with switching costs. In Proceedings of the Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
•  Y. Bartal, A. Blum, C. Burch, and A. Tomkins. A polylog(n)-competitive algorithm for metrical task systems. In Proceedings of the ACM Symposium on Theory of Computing (STOC), pages 711–719, 1997.
•  M. Bienkowski, J. Byrka, M. Chrobak, C. Coester, L. Jez, and E. Koutsoupias. Better bounds for online line chasing. arXiv preprint arXiv:1811.09233, 2018.
•  A. Blum and C. Burch. On-line learning and the metrical task system problem. Machine Learning, 39(1):35–58, 2000.
•  A. Borodin, N. Linial, and M. E. Saks. An optimal on-line algorithm for metrical task system. Journal of the ACM, 39(4):745–763, 1992.
•  S. Bubeck, M. B. Cohen, Y. T. Lee, J. R. Lee, and A. Mądry. k-server via multiscale entropic regularization. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 3–16, 2018.
•  S. Bubeck, Y. T. Lee, Y. Li, and M. Sellke. Competitively chasing convex bodies. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), 2019.
•  N. Buchbinder, A. Gupta, M. Molinaro, and J. Naor. k-servers with a smile: online algorithms via projections. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 98–116, 2019.
•  N. Chen, A. Agarwal, A. Wierman, S. Barman, and L. L. Andrew. Online convex optimization using predictions. ACM SIGMETRICS Performance Evaluation Review, 43(1):191–204, 2015.
•  N. Chen, J. Comden, Z. Liu, A. Gandhi, and A. Wierman. Using predictions in online optimization: Looking forward with an eye on the past. ACM SIGMETRICS Performance Evaluation Review, 44(1):193–206, 2016.
•  N. Chen, G. Goel, and A. Wierman. Smoothed online convex optimization in high dimensions via online balanced descent. In Proceedings of Conference On Learning Theory (COLT), pages 1574–1594, 2018.
•  A. Daniely and Y. Mansour. Competitive ratio vs regret minimization: achieving the best of both worlds. In Proceedings of Algorithmic Learning Theory, pages 333–368, 2019.
•  J. Friedman and N. Linial. On convex body chasing. Discrete & Computational Geometry, 9(3):293–321, 1993.
•  G. Goel, N. Chen, and A. Wierman. Thinking fast and slow: Optimization decomposition across timescales. In Proceedings of the IEEE Conference on Decision and Control (CDC), pages 1291–1298, 2017.
•  G. Goel and A. Wierman. An online algorithm for smoothed regression and LQR control. In Proceedings of the Machine Learning Research, volume 89, pages 2504–2513, 2019.
•  E. Hazan et al. Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157–325, 2016.
•  V. Joseph and G. de Veciana. Jointly optimizing multi-user rate adaptation for video transport over wireless systems: Mean-fairness-variability tradeoffs. In Proceedings of the IEEE INFOCOM, pages 567–575, 2012.
•  S. Kakade, S. Shalev-Shwartz, and A. Tewari. On the duality of strong convexity and strong smoothness: Learning applications and matrix regularization. Unpublished Manuscript, http://ttic.uchicago.edu/%7eshai/papers/KakadeShalevTewari09.pdf, 2009.
•  S. Kim and G. B. Giannakis. An online convex optimization approach to real-time energy pricing for demand response. IEEE Transactions on Smart Grid, 8(6):2784–2793, 2017.
•  T. Kim, Y. Yue, S. Taylor, and I. Matthews. A decision tree framework for spatiotemporal sequence prediction. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 577–586, 2015.
•  Y. Li, G. Qu, and N. Li. Using predictions in online optimization with switching costs: A fast algorithm and a fundamental limit. In Proceedings of the American Control Conference (ACC), pages 3008–3013. IEEE, 2018.
•  M. Lin, Z. Liu, A. Wierman, and L. L. Andrew. Online algorithms for geographical load balancing. In Proceedings of the International Green Computing Conference (IGCC), pages 1–10, 2012.
•  M. Lin, A. Wierman, L. L. Andrew, and E. Thereska. Dynamic right-sizing for power-proportional data centers. IEEE/ACM Transactions on Networking (TON), 21(5):1378–1391, 2013.
•  M. S. Manasse, L. A. McGeoch, and D. D. Sleator. Competitive algorithms for server problems. Journal of Algorithms, 11(2):208–230, 1990.
•  N. Murata, T. Takenouchi, T. Kanamori, and S. Eguchi. Information geometry of u-boost and Bregman divergence. Neural Computation, 16(7):1437–1481, 2004.
•  N. Parikh and S. Boyd. Proximal algorithms. Foundations and Trends in Optimization, 1(3):127–239, 2014.
•  S. Shalev-Shwartz and Y. Singer. On the equivalence of weak learnability and linear separability: New relaxations and efficient boosting algorithms. Machine Learning, 80(2-3):141–163, 2010.

## Appendix A Proof of Theorem 1

We consider a sequence of hitting cost functions on the real line such that the algorithm stays at the starting point through time steps $1, \dots, n$ and is forced to incur a huge movement cost at time step $n+1$, whereas the offline adversary can pay relatively little cost by dividing the long trek between $0$ and $1$ into multiple small steps through time steps $1, \dots, n+1$.

Specifically, suppose the starting point of the algorithm and the offline adversary is $x_0 = 0$, and the hitting cost functions are

$$
f_t(x) = \begin{cases} \frac{m}{2} x^2 & t \in \{1, 2, \cdots, n\} \\ \frac{m'}{2} (x-1)^2 & t = n+1 \end{cases}
$$

for some large parameter $m'$ that we choose later.

Consider the first time step at which the algorithm moves away from the starting point. If this happens at or before time step $n$, we stop the game at that time step and compare the algorithm with an offline adversary which always stays at $0$. The total cost of the offline adversary is 0, but the total cost of the algorithm is non-zero. So, the competitive ratio is unbounded.

Next we consider the case where the algorithm first moves at time step $n+1$. This implies that $x_1 = \dots = x_n = 0$ and $x_{n+1}$ is some non-zero point, say $x$. We see that the cost incurred by the online algorithm is

$$
\mathrm{cost}(ALG) \ge \min_{x_{n+1}} \left( M_{n+1} + H_{n+1} \right) = \min_{x} \left( \frac{1}{2} x^2 + \frac{m'}{2} (x-1)^2 \right).
$$

Notice that the right-hand side tends to $\frac{1}{2}$ as $m'$ tends to infinity; specifically, we have

$$
\mathrm{cost}(ALG) \ge \min_{x} \left( \frac{1}{2} x^2 + \frac{m'}{2} (x-1)^2 \right) = \frac{1}{2\left(1 + \frac{1}{m'}\right)}. \tag{3}
$$
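The closed form in (3) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the minimizer is obtained from the first-order condition $x + m'(x-1) = 0$, which is our own derivation rather than a step stated in the proof.

```python
def online_objective(x: float, m_prime: float) -> float:
    """Movement cost x^2/2 from 0 plus hitting cost m'/2 * (x - 1)^2."""
    return 0.5 * x * x + 0.5 * m_prime * (x - 1.0) ** 2


def online_minimizer(m_prime: float) -> float:
    """Solve the first-order condition x + m'(x - 1) = 0."""
    return m_prime / (1.0 + m_prime)


def online_lower_bound(m_prime: float) -> float:
    """Right-hand side of equation (3)."""
    return 1.0 / (2.0 * (1.0 + 1.0 / m_prime))


if __name__ == "__main__":
    for m_prime in (1.0, 10.0, 1e3, 1e6):
        x_star = online_minimizer(m_prime)
        print(m_prime, online_objective(x_star, m_prime), online_lower_bound(m_prime))
```

As $m'$ grows, both printed values approach $\frac{1}{2}$, in line with the remark preceding (3).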

Now let us consider the offline optimal. Notice that, in the limit as tends to infinity, the offline optimal must satisfy and ; otherwise it would incur unbounded cost. Our lower bound is derived by considering the case when and so we constrain the adversary to satisfy the above, knowing that the adversary is not optimal for finite , i.e., with as .

Let the sequence of points the adversary chooses be $x^* = (x^*_0, x^*_1, \dots, x^*_{n+1})$. We compute the cost incurred by the adversary as follows where, to simplify presentation, we define $K(n, y)$ to be the set of sequences with $x^*_0 = 0$ and $x^*_{n+1} = y$.

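$$
\begin{aligned}
a_n &= 2 \min_{x^* \in K(n,1)} \sum_{i=1}^{n+1} \left( H^*_i + M^*_i \right) \\
&= 2 \min_{x^* \in K(n,1)} \left( \sum_{i=1}^{n} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+1} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right).
\end{aligned}
$$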

In words, $a_n$ is twice the minimal offline cost subject to the constraints $x^*_0 = 0$ and $x^*_{n+1} = 1$. We derive the limiting behavior of the offline costs as $n \to \infty$ in the following lemma.

###### Lemma 7.

For , define

$$
a_n = 2 \min_{x^* \in K(n,1)} \left( \sum_{i=1}^{n} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+1} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right).
$$

Then we have $\lim_{n \to \infty} a_n = \frac{-m + \sqrt{m^2 + 4m}}{2}$.

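Given the lemma, the total cost of the offline adversary will be $\frac{a_n}{2}$. Finally, applying (3), we know $\mathrm{cost}(ALG) \ge \frac{1}{2\left(1 + \frac{1}{m'}\right)}$ and $\mathrm{cost}(ADV) = \frac{a_n}{2}$, so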

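$$
\frac{\mathrm{cost}(ALG)}{\mathrm{cost}(ADV)} \ge \frac{\frac{1}{2\left(1 + \frac{1}{m'}\right)}}{\frac{a_n}{2}} = \frac{1}{\left(1 + \frac{1}{m'}\right) a_n}.
$$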

By taking the limit $n \to \infty$ and $m' \to \infty$ and using Lemma 7, we obtain

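$$
\frac{\mathrm{cost}(ALG)}{\mathrm{cost}(OPT)} = \lim_{n, m' \to \infty} \frac{\mathrm{cost}(ALG)}{\mathrm{cost}(ADV)} \ge \left( \frac{-m + \sqrt{m^2 + 4m}}{2} \right)^{-1} = \frac{1 + \sqrt{1 + \frac{4}{m}}}{2}.
$$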
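To see how this bound scales, the following sketch evaluates it for shrinking $m$ and rescales by $\sqrt{m}$. The function names are hypothetical and the values of $m$ are arbitrary illustrations; the only formulas used are the two expressions in the display above.

```python
import math


def lower_bound(m: float) -> float:
    """(1 + sqrt(1 + 4/m)) / 2, the final expression in the display above."""
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 / m))


def inverse_fixed_point(m: float) -> float:
    """((-m + sqrt(m^2 + 4m)) / 2)^(-1), the middle expression in the display."""
    return 2.0 / (-m + math.sqrt(m * m + 4.0 * m))


if __name__ == "__main__":
    for m in (1.0, 1e-2, 1e-4, 1e-6):
        # The rescaled bound approaches 1, i.e. the bound grows like m^(-1/2).
        print(m, lower_bound(m), lower_bound(m) * math.sqrt(m))
```

The rescaled values tend to 1 as $m$ tends to zero, consistent with a lower bound of order $m^{-1/2}$.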

All that remains is to prove Lemma 7, which describes the cost of the offline adversary in the limit as $n$ tends to infinity.

###### Proof of Lemma 7.

Using the fact that the costs are all homogeneous of degree 2, we see that for all , we have

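$$
\begin{aligned}
&\min_{x^* \in K(n,y)} \left( \sum_{i=1}^{n} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+1} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right) \\
&\qquad = y^2 \min_{x^* \in K(n,1)} \left( \sum_{i=1}^{n} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+1} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right).
\end{aligned} \tag{4}
$$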

The sequence $\{a_n\}$ satisfies the following recursive relationship:

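$$
\begin{aligned}
a_{n+1} &= 2 \min_{x^* \in K(n+1,1)} \left( \sum_{i=1}^{n+1} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+2} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right) \\
&= 2 \min_{0 \le x \le 1} \left( \min_{x^* \in K(n,x)} \left( \sum_{i=1}^{n} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+1} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right) + \frac{m}{2} x^2 + \frac{1}{2} (1-x)^2 \right) \\
&= 2 \min_{0 \le x \le 1} \left( x^2 \min_{x^* \in K(n,1)} \left( \sum_{i=1}^{n} \frac{m}{2} (x^*_i)^2 + \sum_{i=1}^{n+1} \frac{1}{2} (x^*_i - x^*_{i-1})^2 \right) + \frac{m}{2} x^2 + \frac{1}{2} (1-x)^2 \right) \\
&= 2 \min_{0 \le x \le 1} \left( \frac{a_n}{2} x^2 + \frac{m}{2} x^2 + \frac{1}{2} (1-x)^2 \right) \\
&= \frac{a_n + m}{a_n + m + 1}.
\end{aligned} \tag{5}
$$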
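As a sanity check, the recursion can be compared against a direct solution of the offline minimization. The sketch below is not part of the proof. It assumes the endpoints $x^*_0 = 0$ and $x^*_{n+1} = 1$ and the starting value $a_0 = 1$ (a single step from 0 to 1 costs $\frac{1}{2}$). It solves the stationarity conditions $(m+2)x^*_i - x^*_{i-1} - x^*_{i+1} = 0$ with a standard tridiagonal (Thomas) solve. All function names are hypothetical.

```python
def recursive_a(n: int, m: float) -> float:
    """Iterate a_{k+1} = (a_k + m) / (a_k + m + 1) starting from a_0 = 1."""
    a = 1.0
    for _ in range(n):
        a = (a + m) / (a + m + 1.0)
    return a


def offline_path(n: int, m: float) -> list:
    """Minimizing path x_0..x_{n+1} with x_0 = 0 and x_{n+1} = 1 (Thomas algorithm)."""
    if n == 0:
        return [0.0, 1.0]
    diag = m + 2.0            # main diagonal; both off-diagonals equal -1
    rhs = [0.0] * n
    rhs[-1] = 1.0             # contribution of the fixed endpoint x_{n+1} = 1
    c = [0.0] * n             # modified super-diagonal
    d = [0.0] * n             # modified right-hand side
    c[0] = -1.0 / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return [0.0] + x + [1.0]


def twice_offline_cost(n: int, m: float) -> float:
    """Twice the hitting plus movement cost of the minimizing path."""
    x = offline_path(n, m)
    hitting = sum(0.5 * m * x[i] ** 2 for i in range(1, n + 1))
    movement = sum(0.5 * (x[i] - x[i - 1]) ** 2 for i in range(1, n + 2))
    return 2.0 * (hitting + movement)


if __name__ == "__main__":
    for n in range(6):
        print(n, recursive_a(n, 0.5), twice_offline_cost(n, 0.5))
```

The two columns should agree up to floating-point error for each $n$.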

Solving the equation $x = \frac{x + m}{x + m + 1}$, we find the two fixed points of the recursive relationship are

$$
x_1 = \frac{-m + \sqrt{m^2 + 4m}}{2},
$$

and

$$
x_2 = \frac{-m - \sqrt{m^2 + 4m}}{2}.
$$

Notice that for $i \in \{1, 2\}$, we have

$$
m - (m+1) x_i = -(1 - x_i) x_i.
$$

Using this property, we obtain

$$
a_{n+1} - x_1 = \frac{a_n + m}{a_n + m + 1} - x_1 = \frac{(1 - x_1) a_n + m - (m+1) x_1}{a_n + m + 1} = \frac{(1 - x_1)(a_n - x_1)}{a_n + m + 1}, \tag{6}
$$

and

$$
a_{n+1} - x_2 = \frac{a_n + m}{a_n + m + 1} - x_2 = \frac{(1 - x_2) a_n + m - (m+1) x_2}{a_n + m + 1} = \frac{(1 - x_2)(a_n - x_2)}{a_n + m + 1}. \tag{7}
$$

Notice that . By dividing equations (6) and (7), we obtain

$$
\frac{a_{n+1} - x_1}{a_{n+1} - x_2} = \frac{1 - x_1}{1 - x_2} \cdot \frac{a_n - x_1}{a_n - x_2}, \quad \forall n \ge 0.
$$

Recall that $a_0 = 1$. Therefore we have

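$$
\frac{a_n - x_1}{a_n - x_2} = \left( \frac{1 - x_1}{1 - x_2} \right)^{n} \cdot \frac{a_0 - x_1}{a_0 - x_2} = \left( \frac{1 - x_1}{1 - x_2} \right)^{n+1}.
$$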

Rearranging this equation, we get

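$$
a_n = \left( 1 - \left( \frac{1 - x_1}{1 - x_2} \right)^{n+1} \right)^{-1} \left( x_1 - x_2 \cdot \left( \frac{1 - x_1}{1 - x_2} \right)^{n+1} \right).
$$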

Since $0 < \frac{1 - x_1}{1 - x_2} < 1$, we have

$$
\lim_{n \to \infty} a_n = x_1 = \frac{-m + \sqrt{m^2 + 4m}}{2}. \tag{8}
$$
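The closed form for $a_n$ and the limit in (8) can be checked numerically. The sketch below assumes $a_0 = 1$ and the recursion $a_{n+1} = \frac{a_n + m}{a_n + m + 1}$ from (5). The function names and the choice $m = 0.5$ are illustrative only.

```python
import math


def fixed_points(m: float) -> tuple:
    """The two roots x_1 > 0 > x_2 of x = (x + m) / (x + m + 1)."""
    root = math.sqrt(m * m + 4.0 * m)
    return (-m + root) / 2.0, (-m - root) / 2.0


def a_recursive(n: int, m: float) -> float:
    """Iterate the recursion (5) n times starting from a_0 = 1."""
    a = 1.0
    for _ in range(n):
        a = (a + m) / (a + m + 1.0)
    return a


def a_closed_form(n: int, m: float) -> float:
    """Closed form for a_n obtained by dividing (6) by (7) and rearranging."""
    x1, x2 = fixed_points(m)
    r = ((1.0 - x1) / (1.0 - x2)) ** (n + 1)
    return (x1 - x2 * r) / (1.0 - r)


if __name__ == "__main__":
    m = 0.5
    x1, _ = fixed_points(m)
    for n in (0, 1, 2, 5, 10, 50):
        print(n, a_recursive(n, m), a_closed_form(n, m), x1)
```

For $m = 0.5$ the fixed points are $x_1 = 0.5$ and $x_2 = -1$, so the contraction factor is $0.25$ and the iterates approach $x_1$ quickly.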

## Appendix B Proof of Theorem 2

Our proof of Theorem 2 relies on a set of technical lemmas, which follow. Lemma 8 and Lemma 10 work together to establish a lower bound on the competitive ratio as $m$ tends to zero when the balance parameter is set to be , while Lemma 11 establishes a lower bound on the competitive ratio as $m$ tends to zero when the balance parameter is set to be .

###### Lemma 8.

If , the competitive ratio of OBD is when .

###### Proof.

Our approach is to construct a scenario where OBD is forced to move along the circumference of a large circle, but the offline adversary moves along the circumference of a much smaller circle (see Figure 1). The adversary is hence able to pay much smaller movement costs, forcing the competitive ratio to be large.

We propose a series of costs which force OBD to move in a circle. The idea is to construct a cost function so that, at the end of every round, the relative positions of the OBD algorithm, the offline adversary, and the minimizer are fixed. Since OBD is memoryless, we can simply input this function arbitrarily many times and the positions of OBD and the offline adversary will trace out a pair of concentric circles (see Figure 1).

Suppose that, at the start of a round, OBD is at the point . Let be the distance between OBD and the adversary. Consider a right triangle such that , the offline adversary is at some point on the hypotenuse and (see Figure 2). Let us introduce a coordinate system such that the origin lies at , the -axis contains and the -axis is parallel to , such that the positive part of the axis lies on the same side of as the segment . Our goal is to construct a cost function which forces OBD towards . This will preserve the relative positions of OBD and the adversary, since we assumed that they were a distance away at the start of the round. Consider the costs , where is the distance from the point to the line passing through and and is a parameter we will pick later. Define . Notice that is -strongly convex because it is the sum of an -strongly convex function and a convex function. Intuitively, when is large, the function is infinity outside of the line but is equal to when restricted to points on the line. After observing the cost , OBD will pick some new point .

The following lemma highlights that can be driven arbitrarily close to by taking to be sufficiently large.

###### Lemma 9.

Let , and suppose is picked so that . Then the point picked by OBD satisfies .

Figure 2: In the right triangle $\triangle ABC$, $\angle ABC = 90^\circ$, $|BC| = \ell$, and $|AB| = h = \sqrt{\gamma m}\,\ell$. Point $D$ is on the line segment $AC$ such that $|AD| = \ell$. OBD starts at point $A$ and selects point $E$. The offline adversary starts at point $D$ and selects point $F$. $G$ is the projection of $E$ onto the line segment $AB$.

We instruct the adversary to pick the point on the line (the -axis) such that (see Figure 2). Notice that , where we used the triangle inequality. Let . We see that the total cost incurred by the offline adversary is

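$$
M^*_t + H^*_t = \frac{1}{2} |DF|^2 + \frac{m}{2} |CF|^2 \le \frac{1}{2} \left( |DC| + |CF| \right)^2 + \frac{m}{2} |CF|^2 \le \frac{1}{2} (z + \varepsilon)^2 + \frac{m \varepsilon^2}{2},
$$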

where we applied the triangle inequality.

Notice that by the Pythagorean theorem (recall that is a right triangle). Since and , we see that . Hence the movement cost incurred by the OBD is

$$
M_t \ge \frac{1}{2} (h - \varepsilon)^2 = \frac{1}{2} \left( \sqrt{2 z \ell + z^2} - \varepsilon \right)^2.
$$

Hence the ratio of the costs is

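$$
\frac{M_t + H_t}{M^*_t + H^*_t} \ge \frac{M_t}{M^*_t + H^*_t} \ge \frac{\frac{1}{2} \left( \sqrt{2 z \ell + z^2} - \varepsilon \right)^2}{\frac{1}{2} (z + \varepsilon)^2 + \frac{m \varepsilon^2}{2}}.
$$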

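Since the limit of this expression as $\varepsilon \to 0$ is $\frac{2 z \ell + z^2}{z^2} = 1 + \frac{2\ell}{z}$, for sufficiently small $\varepsilon$ this will be at least $\frac{\ell}{z}$. Since $z = \sqrt{h^2 + \ell^2} - \ell$ and $h = \sqrt{\gamma m}\,\ell$, the ratio of costs is at least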

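$$
\frac{\ell}{\sqrt{\gamma m \ell^2 + \ell^2} - \ell} = \frac{1}{\sqrt{\gamma m + 1} - 1} = \frac{\sqrt{\gamma m + 1} + 1}{\gamma m} \ge \frac{2}{\gamma m}.
$$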
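The chain of (in)equalities above can be spot-checked numerically. The sketch below takes $z = |AC| - |AD| = \sqrt{h^2 + \ell^2} - \ell$ with $h = \sqrt{\gamma m}\,\ell$, which is our reading of the geometry in Figure 2. The function names and parameter values are hypothetical illustrations.

```python
import math


def adversary_gap(gamma: float, m: float, ell: float) -> float:
    """z = |AC| - |AD|, assuming |AB| = sqrt(gamma * m) * ell and |BC| = |AD| = ell."""
    h = math.sqrt(gamma * m) * ell
    return math.hypot(h, ell) - ell


def per_round_ratio(gamma: float, m: float, ell: float, eps: float) -> float:
    """Lower bound on OBD's movement cost over the upper bound on the adversary's cost."""
    z = adversary_gap(gamma, m, ell)
    obd_movement = 0.5 * (math.sqrt(2.0 * z * ell + z * z) - eps) ** 2
    adversary_cost = 0.5 * (z + eps) ** 2 + 0.5 * m * eps ** 2
    return obd_movement / adversary_cost


def limiting_bound(gamma: float, m: float) -> float:
    """(sqrt(gamma * m + 1) + 1) / (gamma * m), which equals ell / z."""
    return (math.sqrt(gamma * m + 1.0) + 1.0) / (gamma * m)


if __name__ == "__main__":
    gamma, m, ell = 1.0, 0.01, 2.0
    z = adversary_gap(gamma, m, ell)
    print(ell / z, limiting_bound(gamma, m), 2.0 / (gamma * m))
    print(per_round_ratio(gamma, m, ell, eps=1e-9))
```

Note that the bound does not depend on $\ell$, so the construction can be repeated at any scale.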

Now, we describe the whole process. When , the hitting cost function is . While OBD stays at , the adversary moves to the point ; it incurs a one-time cost of . On all subsequent steps , we repeatedly apply the construction, which forces OBD to move in a circle. The one-time cost incurred by the adversary to set up the game is negligible in the limit as is large, and the per-round ratio of costs is , so the competitive ratio is also as claimed. ∎

The key technical lemma used in the proof is Lemma 9, and we now provide a proof of that result.

###### Proof of Lemma 9.

Suppose . We first show that OBD selects the point strictly contained by the -level set, which is the one lies on. First observe that the point satisfies the balance condition: , because we constructed so that and . However, the point is not necessarily a projection of onto any level set of . If OBD projected onto the level set which lies on, it would incur less cost than if it moved to ; however then the balance condition would be violated. To restore the balance condition, we must increase the movement cost while decreasing the hitting cost – which means we must move to a strictly smaller level set, say the -level set, where .

Let denote the -coordinate of , using the coordinate system we define in the proof of Lemma 8. Notice that , since was defined to be the vertical distance to the -axis times . Since , we see that , where we used the fact that lies on the level set and . By the balance condition,