Optimal Methods for Higher-Order Smooth Monotone Variational Inequalities

In this work, we present new simple and optimal algorithms for solving the variational inequality (VI) problem for p-th order smooth, monotone operators – a problem that generalizes convex optimization and saddle-point problems. Recent works (Bullins and Lai (2020), Lin and Jordan (2021), Jiang and Mokhtari (2022)) present methods that achieve a rate of Õ(ε^{-2/(p+1)}) for p ≥ 1, extending results of Nemirovski (2004) and Monteiro and Svaiter (2012) for p = 1 and p = 2, respectively. A drawback of these approaches, however, is their reliance on a line search scheme. We provide the first p-th order method that achieves a rate of O(ε^{-2/(p+1)}). Our method does not rely on a line search routine, thereby improving upon previous rates by a logarithmic factor. Building on the Mirror Prox method of Nemirovski (2004), our algorithm works even in the constrained, non-Euclidean setting. Furthermore, we prove the optimality of our algorithm by constructing matching lower bounds. These are the first lower bounds for smooth MVIs with p > 1 that go beyond those known for convex optimization. This establishes a separation between solving smooth MVIs and smooth convex optimization, and settles the oracle complexity of solving p-th order smooth MVIs.


1 Introduction

We consider the problem of monotone variational inequalities (MVI), a well-studied setting which captures convex optimization and convex-concave saddle-point problems, among others. While convex optimization arises as the special case in which the operator is the gradient of a convex function, the MVI framework allows for more general operators that go beyond the scope of convex problems. There has been particular interest in developing methods for smooth MVIs, mirroring the case of smooth convex optimization. The Mirror Prox method of Nemirovski (2004) for MVIs, which generalizes the extragradient method (Korpelevich, 1976), achieves an O(ε^{-1}) iteration complexity, i.e., a rate of O(1/T) after T iterations. Later work by Nesterov (2007) arrives at the same iteration complexity, though it extends the extragradient approach to the dual space, resulting in what is known as the dual extrapolation method. Furthermore, it has been shown that such rates are tight, under first-order oracle access, for smooth convex-concave saddle point problems (Ouyang and Xu, 2021), which are a special case of the MVI problem.
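
To make these reductions concrete: minimizing a differentiable convex function f over X corresponds to the VI problem with the gradient operator F = ∇f, while a convex-concave saddle-point problem min_u max_v f(u, v) corresponds to the monotone operator

F(u, v) = (∇_u f(u, v), −∇_v f(u, v)).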

In addition, several works have studied MVIs in contexts beyond standard smoothness conditions. Nesterov (2006) considers a second-order approach inspired by the cubic regularization method (Nesterov and Polyak, 2006), for MVIs where the Jacobian of the operator is Lipschitz continuous (referred to as second-order smoothness, i.e., ‖∇F(x) − ∇F(y)‖ ≤ L‖x − y‖), and achieves an O(ε^{-1}) rate. Under the same assumption of second-order smoothness, Monteiro and Svaiter (2012) show how to achieve an improved convergence rate of Õ(ε^{-2/3}). Their approach extends their previous work (Monteiro and Svaiter, 2010) on the hybrid proximal extragradient (HPE) method, which has garnered much interest in recent years in the context of faster accelerated methods for higher-order smooth convex optimization (Monteiro and Svaiter, 2013; Gasnikov et al., 2019; Bullins, 2020) that are near-optimal in terms of higher-order oracle complexity (Agarwal and Hazan, 2018; Arjevani et al., 2019).

These results have since been generalized to p-th order smooth MVIs (Bullins and Lai, 2020; Jiang and Mokhtari, 2022), establishing rates of the form Õ(ε^{-2/(p+1)}). However, a drawback of these works, as well as of Monteiro and Svaiter (2012), is that they all require an additional line search procedure. In this note, we provide a simple method and analysis, inspired by the Mirror Prox method of Nemirovski (2004), which achieves the O(ε^{-2/(p+1)}) rate without requiring a binary search.

Independent work by Lin and Jordan (2022).

Concurrently appearing on arXiv, Lin and Jordan (2022) also establish similar results in this setting without requiring a binary search procedure. Our results were derived independently from theirs.

2 Preliminaries

Let X be a closed convex set in ℝ^d. We let ‖·‖ denote an arbitrary norm on ℝ^d, and we let r : X → ℝ denote a prox function that is 1-strongly convex with respect to ‖·‖, i.e., for all x, y ∈ X,

r(y) ≥ r(x) + ⟨∇r(x), y − x⟩ + (1/2)‖y − x‖².

We let D_r denote the Bregman divergence of r, i.e., for all x, y ∈ X,

D_r(x, y) := r(x) − r(y) − ⟨∇r(y), x − y⟩. (1)
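
Two standard examples: taking r(x) = (1/2)‖x‖₂² gives D_r(x, y) = (1/2)‖x − y‖₂², recovering the Euclidean setting, while taking r to be the negative entropy r(x) = Σ_i x_i log x_i over the probability simplex (which is 1-strongly convex with respect to ‖·‖₁) gives the KL divergence D_r(x, y) = Σ_i x_i log(x_i / y_i).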

2.1 Background Results

We first recall the standard Three Point Property of the Bregman divergence, which generalizes the law of cosines.

Lemma 2.1 (Three Point Property).

Let D_r denote the Bregman divergence of a differentiable convex function r. The Three Point Property states that, for any x, y, z ∈ X,

⟨∇r(y) − ∇r(x), z − y⟩ = D_r(z, x) − D_r(z, y) − D_r(y, x).
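
For completeness, this identity follows by expanding each divergence via (1): the r(·) terms cancel, and

D_r(z, x) − D_r(z, y) − D_r(y, x) = ⟨∇r(y), z − y⟩ − ⟨∇r(x), z − y⟩ = ⟨∇r(y) − ∇r(x), z − y⟩.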

Lemma 2.2 (Tseng (2008)).

Let φ be a convex function, let w ∈ X, and let

z = argmin_{x ∈ X} { φ(x) + D_r(x, w) }.

Then, for all u ∈ X,

φ(z) + D_r(z, w) ≤ φ(u) + D_r(u, w) − D_r(u, z).

The following lemma follows from convexity.

Lemma 2.3 (Bullins and Lai (2020)).

Let a_k > 0 for all k ∈ [T] and let q > 0. Then Σ_{k=1}^T a_k^{−q} ≥ T^{1+q} (Σ_{k=1}^T a_k)^{−q}.

2.2 Monotone Variational Inequalities

In this section, we formally define our problem and recall the relevant definitions for higher-order derivatives.

Definition 2.4 (Directional Derivative).

Consider a p-times differentiable operator F : ℝ^d → ℝ^d. For x, v_1, …, v_i ∈ ℝ^d with i ≤ p, we let

∇^(i)F(x)[v_1, …, v_i]

denote the i-th directional derivative of F at x along the directions v_1, …, v_i.

Definition 2.5 (Monotone Operator).

Consider an operator F : X → ℝ^d. We say that F is monotone if

⟨F(x) − F(y), x − y⟩ ≥ 0 for all x, y ∈ X.

Equivalently, a differentiable operator F is monotone if its Jacobian is positive semidefinite everywhere, i.e., ⟨v, ∇F(x) v⟩ ≥ 0 for all x ∈ X and v ∈ ℝ^d.
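
For example, for a differentiable convex function f, the gradient operator F = ∇f is monotone, since convexity gives ⟨∇f(x) − ∇f(y), x − y⟩ ≥ 0. A monotone operator need not be a gradient: the bilinear saddle operator F(u, v) = (A v, −Aᵀ u) has a skew-symmetric Jacobian, so ⟨w, ∇F(u, v) w⟩ = 0 ≥ 0 for every direction w, yet F is not the gradient of any function when A ≠ 0.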

Definition 2.6 (Higher Order Smooth Operator).

For p ≥ 1, an operator F : X → ℝ^d is p-th order L-smooth with respect to the norm ‖·‖ if its (p−1)-th derivative satisfies

‖∇^(p−1)F(x) − ∇^(p−1)F(y)‖_op ≤ L ‖x − y‖ for all x, y ∈ X, (2)

or the Taylor-remainder bound that (2) implies,

‖F(y) − F_x(y)‖_* ≤ (L / p!) ‖y − x‖^p for all x, y ∈ X, (3)

where we let

F_x(y) := Σ_{i=0}^{p−1} (1/i!) ∇^(i)F(x)[y − x, …, y − x]

denote the (p−1)-th order Taylor expansion of F around x, we let ‖·‖_* denote the dual norm, and we let

‖∇^(i)F(x)‖_op := max_{‖v_1‖ ≤ 1, …, ‖v_i‖ ≤ 1} ‖∇^(i)F(x)[v_1, …, v_i]‖_*

denote the operator norm.
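
Under this convention, the case p = 1 simply requires ‖F(x) − F(y)‖_* ≤ L‖x − y‖ (a Lipschitz operator, the Mirror Prox setting, with F_x(y) = F(x)), while p = 2 requires ‖∇F(x) − ∇F(y)‖_op ≤ L‖x − y‖, i.e., a Lipschitz Jacobian, matching the second-order smoothness assumption of Nesterov (2006) and Monteiro and Svaiter (2012) discussed in the introduction.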

We next define the two kinds of solutions associated with the variational inequality problem for an operator F.

Definition 2.7 (Weak and Strong Solutions).

For an operator F : X → ℝ^d, a strong solution to the variational inequality problem associated with F is a point x* ∈ X satisfying

⟨F(x*), x − x*⟩ ≥ 0 for all x ∈ X.

A weak solution to the variational inequality problem associated with F is a point x* ∈ X satisfying

⟨F(x), x − x*⟩ ≥ 0 for all x ∈ X.
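
We note that when F is monotone, every strong solution x* is also a weak solution, since for any x ∈ X,

⟨F(x), x − x*⟩ ≥ ⟨F(x*), x − x*⟩ ≥ 0,

and a standard converse (Minty's lemma) holds when F is in addition continuous, so the two notions coincide in our setting.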

Definition 2.8 (Our MVI Problem).

Let ε > 0, and let the operator F : X → ℝ^d be monotone, continuous, and p-th order L-smooth with respect to a norm ‖·‖. Our MVI problem asks for a point x̂ ∈ X satisfying

⟨F(x), x̂ − x⟩ ≤ ε for all x ∈ X,

i.e., an ε-approximate weak solution.

3 Algorithm

Definition 3.1 (Oracle).

We assume access to an oracle which, for any , solves the following variational inequality problem:

where

1:procedure MVI-OPT()
2:     for  to  do
3:         
4:         
5:               
6:     return
Algorithm 1 Algorithm for Higher-Order Smooth MVI Optimization
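
For intuition, the following is a minimal sketch of the simplest instance of this scheme: the p = 1 Euclidean case, in which the oracle step of Definition 3.1 reduces to an explicit extragradient (Mirror Prox) update with a fixed step size, so no subproblem solver or line search is needed. The operator, step-size constant, and names below are illustrative only and are not taken from the paper; for p ≥ 2, the extrapolation step is replaced by a call to the oracle of Definition 3.1.

import numpy as np

def extragradient(F, x0, L, T, project=lambda x: x):
    # Illustrative p = 1 Euclidean instance (classical extragradient / Mirror Prox).
    # F       : monotone, L-Lipschitz operator mapping R^d -> R^d
    # x0      : starting point
    # L       : Lipschitz constant of F
    # T       : number of iterations
    # project : Euclidean projection onto the feasible set X (identity if X = R^d)
    lam = 1.0 / (2.0 * L)                      # conservative fixed step size for p = 1
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for _ in range(T):
        x_half = project(x - lam * F(x))       # extrapolation step
        x = project(x - lam * F(x_half))       # update using the extrapolated operator value
        avg += x_half
    return avg / T                             # average of the extrapolated points

# Example usage: bilinear saddle operator F(u, v) = (A v, -A^T u), monotone but not a gradient.
rng = np.random.default_rng(0)
d = 3
A = rng.standard_normal((d, d))
def F(z):
    u, v = z[:d], z[d:]
    return np.concatenate([A @ v, -A.T @ u])
z_bar = extragradient(F, np.ones(2 * d), L=np.linalg.norm(A, 2), T=2000)
print(np.linalg.norm(F(z_bar)))                # small: the unique solution here is z = 0

The averaged extrapolation points are returned because the classical O(1/T) guarantee for Mirror Prox is stated for this average, mirroring the weighted-average output of Algorithm 1.
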
Lemma 3.2.

For any and , the iterates and parameters satisfy

Proof.

For any and any , we first apply Lemma 2.2 with , which gives us

(4)

Additionally, the guarantee of Definition 3.1 with yields

(5)

Applying Lemma 2.2 and the definition of to Equation 5, we have

(6)

Summing Equations 4 and 6, we obtain

(7)

Now, we obtain

Here, we used Hölder’s inequality, Definition 2.6, the 1-strong convexity of r, and an elementary scalar inequality. Combining these bounds and rearranging yields

(8)

We observe that . Applying this fact and summing over all iterations yields

as desired. ∎

We now state and prove our main theorem.

Theorem 3.3.

Let ε > 0, p ≥ 1, and let X ⊆ ℝ^d be any closed convex set. Let F : X → ℝ^d be a monotone operator that is p-th order L-smooth with respect to an arbitrary norm ‖·‖. Let D_r denote the Bregman divergence of a function r that is 1-strongly convex with respect to the same norm ‖·‖. Algorithm 1 returns x̄ ∈ X such that, for all x ∈ X, ⟨F(x), x̄ − x⟩ ≤ ε,

in at most

calls to an oracle that solves the subproblem defined in Definition 3.1.

Proof.

Let . We first note that,

(From monotonicity of F)
(From Lemma 3.2 and )

It is now sufficient to find a lower bound on . We will use Lemma 2.3 for . Observe from Lemma 3.2 that

Now, Lemma 2.3 gives

We thus have for all ,

which gives an ε-approximate weak solution after the stated number of iterations. ∎
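
In particular, for p = 1 the O(ε^{-2/(p+1)}) oracle complexity specializes to O(ε^{-1}), recovering the Mirror Prox guarantee of Nemirovski (2004), while for p = 2 it specializes to O(ε^{-2/3}), matching the rate of Monteiro and Svaiter (2012) but without the logarithmic overhead of their line search.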

References

  • N. Agarwal and E. Hazan (2018) Lower bounds for higher-order convex optimization. In Conference on Learning Theory, pp. 774–792.
  • Y. Arjevani, O. Shamir, and R. Shiff (2019) Oracle complexity of second-order methods for smooth convex optimization. Mathematical Programming 178 (1), pp. 327–360.
  • B. Bullins and K. A. Lai (2020) Higher-order methods for convex-concave min-max optimization and monotone variational inequalities. arXiv preprint arXiv:2007.04528.
  • B. Bullins (2020) Highly smooth minimization of non-smooth problems. In Conference on Learning Theory, pp. 988–1030.
  • A. Gasnikov, P. Dvurechensky, E. Gorbunov, E. Vorontsova, D. Selikhanovych, C. A. Uribe, B. Jiang, H. Wang, S. Zhang, S. Bubeck, et al. (2019) Near optimal methods for minimizing convex functions with Lipschitz p-th derivatives. In Conference on Learning Theory, pp. 1392–1393.
  • R. Jiang and A. Mokhtari (2022) Generalized optimistic methods for convex-concave saddle point problems. arXiv preprint arXiv:2202.09674.
  • G. M. Korpelevich (1976) The extragradient method for finding saddle points and other problems. Matecon 12, pp. 747–756.
  • T. Lin and M. I. Jordan (2022) Perseus: a simple high-order regularization method for variational inequalities. arXiv preprint arXiv:2205.03202.
  • R. D. Monteiro and B. F. Svaiter (2012) Iteration-complexity of a Newton proximal extragradient method for monotone variational inequalities and inclusion problems. SIAM Journal on Optimization 22 (3), pp. 914–935.
  • R. D. Monteiro and B. F. Svaiter (2010) On the complexity of the hybrid proximal extragradient method for the iterates and the ergodic mean. SIAM Journal on Optimization 20 (6), pp. 2755–2787.
  • R. D. Monteiro and B. F. Svaiter (2013) An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods. SIAM Journal on Optimization 23 (2), pp. 1092–1125.
  • A. Nemirovski (2004) Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization 15 (1), pp. 229–251.
  • Y. Nesterov and B. T. Polyak (2006) Cubic regularization of Newton method and its global performance. Mathematical Programming 108 (1), pp. 177–205.
  • Y. Nesterov (2006) Cubic regularization of Newton's method for convex problems with constraints. Technical report, CORE.
  • Y. Nesterov (2007) Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming 109 (2), pp. 319–344.
  • Y. Ouyang and Y. Xu (2021) Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Mathematical Programming 185 (1), pp. 1–35.
  • P. Tseng (2008) Accelerated proximal gradient methods for convex optimization. Technical report, University of Washington, Seattle.