A Local Regret in Nonconvex Online Learning

11/13/2018 ∙ by Sergul Aydore et al. ∙ Amazon, Stevens Institute of Technology

We consider an online learning process to forecast a sequence of outcomes for nonconvex models. A typical measure for evaluating online learning algorithms is regret, but the standard definition of regret is intractable for nonconvex models even in offline settings. Hence, gradient-based definitions of regret are common for both offline and online nonconvex problems. Recently, a notion of local gradient-based regret was introduced. Inspired by the concept of calibration and this local gradient-based regret, we introduce another definition of regret, and we discuss why our definition is more interpretable for forecasting problems. We also provide a bound analysis for our regret under certain assumptions.




1 Introduction

In typical forecasting problems, we make probabilistic estimates of future outcomes based on previous observations. Recently, it has been shown that effective forecasting models can be complex nonconvex models Flunkert et al. (2017); Wen et al. (2017). Frequent updates of these models are desirable, since the relationship between the targets and the outputs may change over time. However, re-training these models can be time-consuming.

Online learning is a method of updating the model on each pattern as it is observed, as opposed to batch learning, where training is performed over groups of patterns. It is a common technique for dynamically adapting to new patterns in the data, or for settings where training over the entire data set is infeasible. The literature on online learning is rich with interesting theoretical and practical applications, but it is usually limited to convex problems, where global optimization is computationally tractable Zinkevich (2003). In contrast, it is NP-hard to compute the global minimum of a nonconvex function over a convex domain Hazan et al. (2017); Hsu et al. (2012).
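The contrast between the two regimes can be sketched in a few lines of Python; the toy data, learning rates, and model (a one-parameter least-squares fit) are illustrative assumptions, not from the paper:

```python
# Sketch: online vs. batch updates for a 1-D least-squares model
# y ~ a * x, using plain gradient descent on the loss (a*x - y)^2.

def online_fit(stream, lr=0.1, a0=0.0):
    """Update the parameter after every observed pattern."""
    a = a0
    for x, y in stream:
        grad = 2 * (a * x - y) * x   # d/da of (a*x - y)^2
        a -= lr * grad
    return a

def batch_fit(stream, lr=0.1, a0=0.0, epochs=100):
    """Update the parameter using the full data set at each step."""
    a = a0
    data = list(stream)
    for _ in range(epochs):
        grad = sum(2 * (a * x - y) * x for x, y in data) / len(data)
        a -= lr * grad
    return a

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # roughly y = 2x
print(round(online_fit(data), 2), round(batch_fit(data), 2))
```

The online pass touches each pattern once and stops near the batch solution; the batch loop converges to the least-squares fit but must revisit the whole data set at every step.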

Due to the intractability of nonconvex problems, various assumptions on the input have been used to design polynomial-time algorithms Arora et al. (2014); Hsu et al. (2012). However, these assumptions are often too specific to particular models, and a more generic approach is needed. One way to achieve this is to replace the "global optimality" requirement with the more modest requirement of stationarity Allen-Zhu and Hazan (2016).

The idea of online learning was borrowed from game theory, where an online player answers a sequence of questions. The true answers are unknown to the player at the time of each decision, and the player suffers a loss after committing to a decision. The loss functions are unknown to the player in advance, and the performance of the sequence of decisions is evaluated by the difference between the accumulated loss and the loss of the best fixed decision in hindsight. Most recently, Hazan et al. (2017) proposed a notion of gradient-based local regret for nonconvex games.

Inspired by Hazan's approach, and incorporating the notion of calibration, we introduce a novel gradient-based local regret for forecasting problems. Calibration is a well-studied concept in forecasting Foster and Vohra (1998). From a game-theoretic point of view, we call a forecasting procedure "calibrated" if the forecasts are consistent in hindsight. To the best of our knowledge, this definition of regret is new. We show that the proposed regret has a logarithmic bound under certain conditions, and we provide insights into the proposed regret. We conjecture that more efficient algorithms can be developed that minimize our regret.

2 Setting

In online forecasting, our goal is to update the model parameters $x_t$ at each time $t$ in order to incorporate the most recently available information. Assume that $\{x_{t-w+1}, \dots, x_t\}$ represents a collection of $w$ consecutive points, where $w \ge 1$ is an integer and $x_0$ represents an initial forecast point. The losses $f_1, \dots, f_T : \mathcal{K} \to \mathbb{R}$ are nonconvex loss functions on some convex subset $\mathcal{K} \subseteq \mathbb{R}^n$. To put it another way, $x_t \in \mathcal{K}$ represents the parameters of a machine learning model at time $t$, and $f_t(x_t)$ represents the loss function computed using the data available at time $t$, given the model parameters $x_t$.

2.1 Regret Analysis

The performance of online learning algorithms is commonly evaluated by the regret, which is defined as the difference between the real cumulative loss and the minimum cumulative loss across the rounds $t = 1, \dots, T$:

$$\text{Regret}(T) \triangleq \sum_{t=1}^{T} f_t(x_t) - \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x).$$

If the regret grows linearly with $T$, it can be concluded that the player is not learning. If, on the other hand, the regret grows sublinearly, the player is learning and its accuracy is improving. While this definition of regret makes sense for convex optimization problems, it is not appropriate for nonconvex problems, due to the NP-hardness of nonconvex global optimization even in offline settings. Indeed, most research on nonconvex problems focuses on finding local optima, and in the literature on nonconvex optimization algorithms it is common to use the magnitude of the gradient to analyze convergence. Hazan et al. (2017) introduced a local regret measure: a new notion of regret that quantifies the objective of predicting points with small gradients on average. At each round of the game, the gradients of the loss functions from the $w$ most recent rounds of play are evaluated at the current forecast, and these gradients are then averaged. Hazan et al. (2017)'s local regret is defined as the sum of the squared magnitudes of these gradient averages.

Definition 2.1.

(Hazan's local regret) The $w$-local regret of an online algorithm is defined as:

$$\mathcal{R}_w(T) \triangleq \sum_{t=1}^{T} \left\| \nabla F_{t,w}(x_t) \right\|^2, \qquad F_{t,w}(x) \triangleq \frac{1}{w} \sum_{i=0}^{w-1} f_{t-i}(x),$$

where $f_t \equiv 0$ for all $t \le 0$ and $1 \le w \le T$. Hazan et al. (2017) proposed various gradient-descent algorithms for which this regret is sublinear.
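This windowed-gradient quantity can be made concrete with a small scalar example; the loss sequence and forecast values below are illustrative assumptions, not the paper's setup:

```python
# Toy numerical illustration of a Hazan-style w-local regret in the
# scalar case: the gradients of the w most recent losses are all
# evaluated at the *current* iterate x_t, then averaged.

def hazan_local_regret(grad_fns, xs, w):
    """Sum over t of ((1/w) * sum_{i=0}^{w-1} f'_{t-i}(x_t))^2,
    with f_t treated as identically 0 for t < 0."""
    total = 0.0
    for t in range(len(xs)):
        avg = sum(grad_fns[t - i](xs[t]) for i in range(w) if t - i >= 0) / w
        total += avg ** 2
    return total

# Illustrative losses f_t(x) = (x - t)^2, so f'_t(x) = 2 * (x - t),
# and an arbitrary fixed sequence of forecasts.
grads = [lambda x, c=t: 2 * (x - c) for t in range(4)]
xs = [0.0, 0.5, 1.0, 1.5]
print(hazan_local_regret(grads, xs, w=2))  # 5.0
```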

2.2 Proposed Local Regret

In order to introduce the concept of calibration Foster and Vohra (1998), let us consider the first-order Taylor series expansion of the cumulative loss:

$$\sum_{t=1}^{T} f_t(x_t + u) \approx \sum_{t=1}^{T} f_t(x_t) + u^\top \sum_{t=1}^{T} \nabla f_t(x_t),$$

where $\|u\| \le \delta$ for any small $\delta > 0$. If the forecasts are well-calibrated, then perturbing $x_t$ by any such $u$ cannot substantially reduce the cumulative loss, which requires the averaged gradient $\frac{1}{T} \sum_{t=1}^{T} \nabla f_t(x_t)$ to be small. Hence, we can say that the sequence $\{x_t\}$ is asymptotically calibrated with respect to $\{f_t\}$ if:

$$\lim_{T \to \infty} \frac{1}{T} \left\| \sum_{t=1}^{T} \nabla f_t(x_t) \right\| = 0.$$
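The first-order argument above can be sanity-checked numerically; the nonconvex loss, the base point, and the perturbation below are illustrative assumptions:

```python
# For a small perturbation u, f(x + u) - f(x) is governed by the linear
# term u * f'(x); so if the (averaged) gradient is near zero, no small
# perturbation can substantially reduce the loss.

f = lambda x: x ** 4 - 2 * x ** 2     # a simple nonconvex loss
fp = lambda x: 4 * x ** 3 - 4 * x     # its derivative
x, u = 0.5, 0.01
actual = f(x + u) - f(x)              # true change in loss
first_order = u * fp(x)               # first-order Taylor prediction
print(abs(actual - first_order) < 1e-3)  # True: remainder is O(u^2)
```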

Definition 2.2.

(Proposed Regret) We propose a $w$-local regret as:

$$\mathcal{LR}_w(T) \triangleq \sum_{t=1}^{T} \left\| \frac{1}{w} \sum_{i=0}^{w-1} \nabla f_{t-i}(x_{t-i}) \right\|^2, \tag{5}$$
where $f_t \equiv 0$ for $t \le 0$. To motivate equation 5, we use the following equality:

$$\frac{1}{w} \sum_{i=0}^{w-1} \nabla f_{t-i}(x_{t-i}) = \frac{1}{w\eta} \left( x_{t-w+1} - x_{t+1} \right),$$

which holds for interior points under the gradient-descent update $x_{t+1} = x_t - \eta \nabla f_t(x_t)$. Using our definition of regret, we effectively evaluate an online learning algorithm by averaging, over a sliding window, the gradients of the losses at the corresponding forecast values. Hazan et al. (2017)'s local regret, on the other hand, averages the previous losses evaluated at the most recent forecast. We believe that our definition of regret is more applicable to forecasting problems, as evaluating today's forecast on previous loss functions can be misleading.
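The proposed quantity admits the same kind of small scalar computation; the loss sequence and forecasts below are illustrative assumptions:

```python
# Toy illustration of the proposed w-local regret in the scalar case:
# each gradient is evaluated at its *own* forecast x_{t-i}, i.e. loss
# and forecast are matched in time.

def proposed_local_regret(grad_fns, xs, w):
    """Sum over t of ((1/w) * sum_{i=0}^{w-1} f'_{t-i}(x_{t-i}))^2,
    with f_t treated as identically 0 for t < 0."""
    total = 0.0
    for t in range(len(xs)):
        avg = sum(grad_fns[t - i](xs[t - i]) for i in range(w) if t - i >= 0) / w
        total += avg ** 2
    return total

# Illustrative losses f_t(x) = (x - t)^2 and an arbitrary forecast path.
grads = [lambda x, c=t: 2 * (x - c) for t in range(4)]
xs = [0.0, 0.5, 1.0, 1.5]
print(proposed_local_regret(grads, xs, w=2))  # 8.75
```

Note that when the forecasts do not move, evaluating every gradient at the current iterate or at its own iterate gives the same value; the two notions differ exactly when the forecast path changes over the window.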

3 Bound Analysis

We provide bounds for different scenarios for the proposed regret in equation 5, for interior points of the feasible set, under the following assumptions: the losses $f_t$ are bounded; the losses $f_t$ are $L$-smooth; and the parameter update at time $t$ is $x_{t+1} = \Pi_{\mathcal{K}}\left[ x_t - \eta_t \nabla f_t(x_t) \right]$, where $\eta_t$ is the learning rate. We consider three scenarios: (i) $\mathcal{K} = \mathbb{R}^n$, $\eta$ is constant and $w \le T$, (ii) $w = 1$ and $\eta_t = 1/t$, (iii) $w = 1$ and $\eta$ is constant. We also note the following theorem, whose proof is provided in Section 5.1.

Theorem 3.1.

where .

3.1 Scenario 1: $\mathcal{K} = \mathbb{R}^n$, $\eta$ is constant and $w \le T$

Since $\mathcal{K} = \mathbb{R}^n$, the update rule becomes $x_{t+1} = x_t - \eta \nabla f_t(x_t)$; in other words, no projection operator is necessary. Hence we can write:

$$\frac{1}{w} \sum_{i=0}^{w-1} \nabla f_{t-i}(x_{t-i}) = \frac{1}{w\eta} \sum_{i=0}^{w-1} \left( x_{t-i} - x_{t-i+1} \right) = \frac{1}{w\eta} \left( x_{t-w+1} - x_{t+1} \right).$$

Writing $x_{t-w+1} - x_{t+1} = \| x_{t-w+1} - x_{t+1} \| \, u$ for a unit vector $u$ such that $\|u\| = 1$, we can write $\left\| \frac{1}{w} \sum_{i=0}^{w-1} \nabla f_{t-i}(x_{t-i}) \right\|^2 = \frac{1}{w^2 \eta^2} \| x_{t-w+1} - x_{t+1} \|^2$. Hence, the bound for the proposed regret becomes:

$$\mathcal{LR}_w(T) \le \frac{1}{w^2 \eta^2} \sum_{t=1}^{T} \| x_{t-w+1} - x_{t+1} \|^2,$$

which can be made sublinear in $T$ if $w$ is selected large enough.
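The telescoping step used in this scenario can be checked numerically for unconstrained gradient descent; the loss sequence, learning rate, and window size below are illustrative assumptions:

```python
# For the unconstrained update x_{t+1} = x_t - eta * f'_t(x_t), each
# gradient equals (x_t - x_{t+1}) / eta, so the window-averaged gradient
# telescopes to (x_{t-w+1} - x_{t+1}) / (w * eta).

eta, w = 0.05, 3
grad = lambda x, t: 2 * (x - t)       # f_t(x) = (x - t)^2, drifting minimum
xs, gs = [0.0], []
for t in range(10):
    g = grad(xs[-1], t)
    gs.append(g)
    xs.append(xs[-1] - eta * g)       # unconstrained update, no projection

t = 9                                  # last round; window covers t-2, t-1, t
avg_grad = sum(gs[t - i] for i in range(w)) / w
telescoped = (xs[t - w + 1] - xs[t + 1]) / (w * eta)
print(abs(avg_grad - telescoped) < 1e-9)  # True
```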

3.2 Scenario 2: $w = 1$ and $\eta_t = 1/t$

Assuming $x_t$ is in the interior of the feasible set for all $t$ and $w = 1$, we can write the result in Theorem 3.1 as:

where the learning rate $\eta_t$ is set to $1/t$. Hence, we get:

Summing this over $t = 1, \dots, T$ yields:

which concludes the logarithmic bound for the proposed regret for interior points when $w = 1$ and $\eta_t = 1/t$.
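If the learning-rate schedule here is indeed a $1/t$ decay, the logarithmic growth ultimately rests on the harmonic-sum estimate $\ln T < \sum_{t=1}^T 1/t \le 1 + \ln T$; a quick numeric check of that estimate:

```python
# With per-round terms proportional to 1/t, the cumulative bound grows
# like the harmonic sum H_T, which is sandwiched between ln(T) and
# 1 + ln(T); the gap H_T - ln(T) approaches Euler's constant (~0.577).

import math

for T in (10, 1000, 100000):
    harmonic = sum(1.0 / t for t in range(1, T + 1))
    assert math.log(T) < harmonic <= 1.0 + math.log(T)
    print(T, round(harmonic - math.log(T), 3))
```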

3.3 Scenario 3: $w = 1$ and $\eta$ is constant

Similar to Section 3.2, we can write:

Summing this result across $t = 1, \dots, T$ yields:

which is quadratic in $T$, but $\eta$ can be selected accordingly to make the upper bound sublinear.

4 Conclusion

We introduced a new definition of local regret to study nonconvex problems in forecasting. We used the concept of calibration and showed that our regret can be written as a local regret for interior points of the feasible set. Our regret differs from Hazan's regret in that it emphasizes today's loss as opposed to past losses. We also showed that our definition of regret has a logarithmic bound under certain constraints. As a future direction, we plan to study the behavior of our regret at boundary points of the feasible set and to propose efficient machine learning algorithms for nonconvex online learning that are optimal in terms of our definition of regret.


5 Appendix

Lemma 5.1.

where , for any such that .


Let and recall that . Then we have:


The inequality in 17 can be justified by the geometrical interpretation of projections, as shown in Figure 1.

Figure 1: (a) Geometrical justification for inequality 17. The angle between and is always less than or equal to ; hence for all . (b) Due to the triangle inequality, . Hence .
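The projection property that Figure 1 illustrates can also be checked numerically; the choice of the unit ball as the convex set and the random sampling scheme are illustrative assumptions:

```python
# For the Euclidean projection P onto a convex set (here, the unit ball),
# the angle between y - P(y) and x - P(y) is at least pi/2 for every
# feasible x, i.e. <y - P(y), x - P(y)> <= 0.

import math, random

def project_unit_ball(v):
    """Euclidean projection of v onto the closed unit ball."""
    n = math.sqrt(sum(c * c for c in v))
    return v if n <= 1.0 else [c / n for c in v]

random.seed(0)
ok = True
for _ in range(1000):
    y = [random.uniform(-3, 3) for _ in range(3)]                     # arbitrary point
    x = project_unit_ball([random.uniform(-3, 3) for _ in range(3)])  # feasible point
    p = project_unit_ball(y)
    inner = sum((yi - pi) * (xi - pi) for yi, pi, xi in zip(y, p, x))
    ok = ok and inner <= 1e-12
print(ok)  # True
```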

Plugging , we have:


Inequality 18 is a result of the triangle inequality, as drawn in Figure 1. Using this in equation 18, we can write:


where equation 20 is a result of . By rewriting as , we get:


Note that by replacing with and with in Figure 1, we can see that . Since , we get:


Proof of Theorem 3.1:
As a result of Lemma 5.1, we can write the following inequality:

The first term can be rewritten as:


The bound for the second term can be written as:


as a result of . The bound for the third term can be rewritten as:


where equation 5 is a result of . Hence, we have:


now, let’s explore the bound for for any . By definition of , we can write:


Hence, . Taking and combining 32 and 37, we get: