We consider the problem of monotone variational inequalities (MVIs), a well-studied setting that captures convex optimization and convex-concave saddle-point problems, among others. Convex optimization arises as the special case in which the operator is the gradient of a convex function, while the MVI framework allows for more general operators that go beyond the scope of convex problems. There has been particular interest in developing methods for smooth MVIs, similar to the case of smooth convex optimization. The Mirror Prox method of Nemirovski (2004) for MVIs, which generalizes the extragradient method (Korpelevich, 1976), achieves an $O(1/\epsilon)$ iteration complexity. Later work by Nesterov (2007) arrives at the same iteration complexity, though it extends the extragradient approach to the dual space, resulting in what is known as the dual extrapolation method. Furthermore, it has been shown that such rates are tight, under first-order oracle access, for smooth convex-concave saddle-point problems (Ouyang and Xu, 2021), which are a special case of the MVI problem.
In addition, several works have studied MVIs in contexts beyond standard smoothness conditions. Nesterov (2006) considers a second-order approach inspired by the cubic regularization method (Nesterov and Polyak, 2006), for MVIs where the Jacobian of the operator is Lipschitz continuous (referred to as second-order smoothness, i.e., $p = 2$), and achieves an $O(1/\epsilon)$ rate. Under the same assumption of second-order smoothness, Monteiro and Svaiter (2012) show how to achieve an improved convergence rate of $O(1/\epsilon^{2/3})$. Their approach extends their previous work (Monteiro and Svaiter, 2010) on the hybrid proximal extragradient (HPE) method, which has garnered much interest in recent years in the context of faster accelerated methods for higher-order smooth convex optimization (Monteiro and Svaiter, 2013; Gasnikov et al., 2019; Bullins, 2020) that are near-optimal in terms of higher-order oracle complexity (Agarwal and Hazan, 2018; Arjevani et al., 2019).
These results have since been generalized to $p^{\text{th}}$-order smooth MVIs (Bullins and Lai, 2020; Jiang and Mokhtari, 2022), with established rates of the form $O(1/\epsilon^{2/(p+1)})$. However, a drawback of these works, as well as of Monteiro and Svaiter (2012), is that they all require an additional line search procedure. In this note, we provide a simple method and analysis, inspired by the Mirror Prox method of Nemirovski (2004), which achieves the $O(1/\epsilon^{2/(p+1)})$ rate without requiring a binary search.
Independent work by Lin and Jordan (2022).
Concurrently appearing on arXiv, Lin and Jordan (2022) also establish similar results in this setting without requiring a binary search procedure. Our results were derived independently from theirs.
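As a concrete reference point for the first-order baseline discussed in this introduction, the following is a minimal sketch of the Euclidean extragradient iteration (the special case of Mirror Prox with the squared Euclidean prox function). The toy problem, the function names `extragradient`, `F`, and `project`, and the step size are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def extragradient(F, project, z0, step, iters):
    """Euclidean extragradient iteration; returns (ergodic average, last iterate)."""
    z = np.asarray(z0, dtype=float)
    running_sum = np.zeros_like(z)
    for _ in range(iters):
        z_half = project(z - step * F(z))       # extrapolation (look-ahead) step
        z = project(z - step * F(z_half))       # update using the operator at z_half
        running_sum += z_half
    return running_sum / iters, z

# Toy bilinear saddle point min_x max_y x*y over the box [-1, 1]^2 (assumed example).
# Its monotone operator is F(x, y) = (y, -x), which is 1-Lipschitz.
def F(z):
    return np.array([z[1], -z[0]])

def project(z):
    return np.clip(z, -1.0, 1.0)

z_avg, z_last = extragradient(F, project, z0=[0.5, 0.5], step=0.5, iters=200)
print(np.linalg.norm(z_avg), np.linalg.norm(z_last))
```

The step size 0.5 is chosen below the reciprocal of the Lipschitz constant of this particular operator; both the averaged and the last iterate approach the unique saddle point at the origin.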
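Let $\mathcal{X}$ be a closed convex set in $\mathbb{R}^n$. We let $\|\cdot\|$ denote any norm and let $d$ denote a prox function that is strongly convex with respect to the norm $\|\cdot\|$, i.e., for all $x, y \in \mathcal{X}$,
$$d(y) - d(x) - \langle \nabla d(x), y - x \rangle \geq \frac{1}{2} \|y - x\|^2.$$
We let $\omega(\cdot,\cdot)$ denote the Bregman divergence of $d$, i.e.,
$$\omega(y, x) = d(y) - d(x) - \langle \nabla d(x), y - x \rangle.$$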
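To make the Bregman divergence concrete, here is a small sketch that evaluates it for two common prox functions. The notation is an assumption made for this illustration: the prox function is written `d`, and the divergence is taken as omega(y, x) = d(y) - d(x) - <grad d(x), y - x>. The helper name `bregman` and the test vectors are hypothetical.

```python
import numpy as np

def bregman(d, grad_d, y, x):
    """Bregman divergence omega(y, x) = d(y) - d(x) - <grad d(x), y - x>."""
    return d(y) - d(x) - np.dot(grad_d(x), y - x)

# Squared Euclidean prox function: the divergence is 0.5 * ||y - x||_2^2.
def sq(x):
    return 0.5 * np.dot(x, x)

def sq_grad(x):
    return x

# Negative entropy on the probability simplex: the divergence is KL(y || x).
def negent(x):
    return np.sum(x * np.log(x))

def negent_grad(x):
    return np.log(x) + 1.0

y = np.array([0.2, 0.3, 0.5])
x = np.array([0.5, 0.25, 0.25])

euclid = bregman(sq, sq_grad, y, x)
kl = bregman(negent, negent_grad, y, x)
print(euclid, kl)
```

For the negative entropy, strong convexity with respect to the $\ell_1$ norm on the simplex is Pinsker's inequality, so `kl` is at least half the squared $\ell_1$ distance between `y` and `x`.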
2.1 Background Results
We first recall the standard Three Point Property of the Bregman divergence, which generalizes the law of cosines.
Lemma 2.1 (Three Point Property).
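Let $\omega(\cdot,\cdot)$ denote the Bregman divergence of a function $d$. The Three Point Property states, for any $x, y, z \in \mathcal{X}$,
$$\langle \nabla d(y) - \nabla d(x), z - x \rangle = \omega(z, x) + \omega(x, y) - \omega(z, y).$$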
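The identity can be checked numerically. The sketch below assumes the convention omega(y, x) = d(y) - d(x) - <grad d(x), y - x> and verifies the property in the form <grad d(y) - grad d(x), z - x> = omega(z, x) + omega(x, y) - omega(z, y), using the negative entropy as an example prox function. The function names and the random test points are illustrative assumptions.

```python
import numpy as np

def d(x):
    return np.sum(x * np.log(x))          # negative entropy, defined for x > 0

def grad_d(x):
    return np.log(x) + 1.0

def omega(y, x):
    """Bregman divergence omega(y, x) = d(y) - d(x) - <grad d(x), y - x>."""
    return d(y) - d(x) - np.dot(grad_d(x), y - x)

rng = np.random.default_rng(0)
x, y, z = rng.uniform(0.1, 2.0, size=(3, 5))   # three random positive points

lhs = np.dot(grad_d(y) - grad_d(x), z - x)
rhs = omega(z, x) + omega(x, y) - omega(z, y)
print(lhs, rhs)
```

With $d(x) = \frac{1}{2}\|x\|_2^2$ the same identity reduces to the law of cosines, which is the sense in which the property generalizes it.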
Lemma 2.2 (Tseng (2008)).
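Let $\phi$ be a convex function, let $x \in \mathcal{X}$, and let
$$x_+ = \arg\min_{y \in \mathcal{X}} \{\phi(y) + \omega(y, x)\}.$$
Then, for all $y \in \mathcal{X}$,
$$\phi(y) + \omega(y, x) \geq \phi(x_+) + \omega(x_+, x) + \omega(y, x_+).$$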
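A quick numerical sanity check of this lemma is sketched below in its commonly stated form, phi(y) + omega(y, x) >= phi(x_plus) + omega(x_plus, x) + omega(y, x_plus), where x_plus minimizes phi(.) + omega(., x) over the feasible set. The setting is an assumption chosen for simplicity: a linear function phi, the squared Euclidean prox function, and the box [0, 1]^n, for which the minimizer has a closed form.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
g = rng.normal(size=n)                 # phi(y) = <g, y> is linear, hence convex
x = rng.uniform(0.0, 1.0, size=n)      # current point in X = [0, 1]^n

def phi(y):
    return np.dot(g, y)

def omega(y, x):
    return 0.5 * np.dot(y - x, y - x)  # Bregman divergence of d(x) = 0.5 * ||x||_2^2

# Minimizer of phi(y) + omega(y, x) over the box: a projected gradient step.
x_plus = np.clip(x - g, 0.0, 1.0)

# Slack of the inequality at random test points y in X; the lemma says it is nonnegative.
ys = rng.uniform(0.0, 1.0, size=(1000, n))
slack = np.array([phi(y) + omega(y, x) - phi(x_plus) - omega(x_plus, x) - omega(y, x_plus)
                  for y in ys])
print(slack.min())
```

In this Euclidean setting the slack equals $\langle g + x_+ - x, y - x_+ \rangle$, which is nonnegative by the first-order optimality condition for the projection.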
The following lemma follows from convexity.
Lemma 2.3 (Bullins and Lai (2020)).
Let for all and let . Then .
2.2 Monotone Variational Inequalities
In this section, we formally define our problem and introduce the definitions we need for higher-order derivatives.
Definition 2.4 (Directional Derivative).
Consider a -times differentiable operator . For , we let
denote, for , the directional derivative of a at along .
Definition 2.5 (Monotone Operator).
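Consider an operator $F : \mathcal{X} \to \mathbb{R}^n$. We say that $F$ is monotone if, for all $x, y \in \mathcal{X}$,
$$\langle F(x) - F(y), x - y \rangle \geq 0.$$
Equivalently, a differentiable operator is monotone if its Jacobian is positive semidefinite, i.e., $\langle h, \nabla F(x) h \rangle \geq 0$ for all $x \in \mathcal{X}$ and all $h$ (the Jacobian need not be symmetric).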
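The following sketch illustrates both characterizations on a linear operator F(x) = Ax whose matrix is not symmetric, so that F is not the gradient of any function. The construction of `A` (a skew-symmetric part plus a positive semidefinite part) and all names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B = rng.normal(size=(n, n))
M = rng.normal(size=(n, n))
A = (B - B.T) + M.T @ M     # skew-symmetric part plus a PSD part; A is not symmetric

def F(x):
    return A @ x            # linear operator whose Jacobian is A everywhere

# Definition check: <F(x) - F(y), x - y> >= 0 on random pairs.
pairs = rng.normal(size=(1000, 2, n))
inner = np.array([np.dot(F(x) - F(y), x - y) for x, y in pairs])

# Jacobian check: the quadratic form h^T A h only sees the symmetric part (A + A^T) / 2.
sym_eigs = np.linalg.eigvalsh(0.5 * (A + A.T))
print(inner.min(), sym_eigs.min())
```

The skew-symmetric part contributes nothing to the quadratic form, which is why bilinear saddle-point operators are monotone even though they are far from being gradients.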
Definition 2.6 (Higher Order Smooth Operator).
For , an operator is order -smooth with respect to norm if the higher order derivative of satisfies
where we let
denote the order Taylor expansion of , and we let
denote the operator norm.
We next define the two kinds of solutions associated with the variational inequality problem for an operator .
Definition 2.7 (Weak and Strong Solutions).
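For an operator $F : \mathcal{X} \to \mathbb{R}^n$, a strong solution to the variational inequality problem associated with $F$ is a point $x^* \in \mathcal{X}$ satisfying
$$\langle F(x^*), x - x^* \rangle \geq 0 \quad \text{for all } x \in \mathcal{X}.$$
A weak solution to the variational inequality problem associated with $F$ is a point $x^* \in \mathcal{X}$ satisfying
$$\langle F(x), x - x^* \rangle \geq 0 \quad \text{for all } x \in \mathcal{X}.$$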
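To illustrate the two solution concepts, the sketch below uses the standard forms: a strong solution x* satisfies <F(x*), x - x*> >= 0 for all feasible x, and a weak solution satisfies <F(x), x - x*> >= 0 for all feasible x. The example is an assumption chosen so that the solution lies on the boundary: F(x) = x - b on the box [0, 1]^2, whose solution is the projection of b onto the box.

```python
import numpy as np

b = np.array([-1.0, 2.0])

def F(x):
    return x - b                       # gradient of 0.5 * ||x - b||^2, hence monotone

x_star = np.clip(b, 0.0, 1.0)          # projection of b onto X = [0, 1]^2

# Evaluate both conditions on a grid of feasible points.
grid = np.linspace(0.0, 1.0, 21)
points = np.array([[u, v] for u in grid for v in grid])

strong_min = min(np.dot(F(x_star), x - x_star) for x in points)   # strong condition
weak_min = min(np.dot(F(x), x - x_star) for x in points)          # weak condition
print(x_star, strong_min, weak_min)
```

Here the same point passes both checks, as expected for a continuous monotone operator; note that `F(x_star)` is nonzero, so the solution is characterized by the inequality rather than by a vanishing operator.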
Definition 2.8 (Our MVI Problem).
Let and operator be monotone, continuous and -order -smooth with respect to a norm . Our MVI problem asks for an satisfying,
Definition 3.1 (Oracle).
We assume access to an oracle which, for any , solves the following variational inequality problem:
For any and , the iterates and parameters satisfy
For any and any , we first apply Lemma 2.2 with , which gives us
Additionally, the guarantee of Definition 3.1 with yields
We now state and prove our main theorem.
Let , and be any closed convex set. Let be an operator that is -order -smooth with respect to an arbitrary norm . Let denote the Bregman divergence of a function that is strongly convex with respect to the same norm . Algorithm 1 returns such that ,
in at most
calls to an oracle that solves the subproblem defined in Definition 3.1.
References
- Agarwal and Hazan (2018). Lower bounds for higher-order convex optimization. In Conference on Learning Theory, pp. 774–792.
- Arjevani et al. (2019). Oracle complexity of second-order methods for smooth convex optimization. Mathematical Programming 178 (1), pp. 327–360.
- Bullins and Lai (2020). Higher-order methods for convex-concave min-max optimization and monotone variational inequalities. arXiv preprint arXiv:2007.04528.
- Bullins (2020). Highly smooth minimization of non-smooth problems. In Conference on Learning Theory, pp. 988–1030.
- Gasnikov et al. (2019). Near optimal methods for minimizing convex functions with Lipschitz $p$-th derivatives. In Conference on Learning Theory, pp. 1392–1393.
- Jiang and Mokhtari (2022). Generalized optimistic methods for convex-concave saddle point problems. arXiv preprint arXiv:2202.09674.
- Korpelevich (1976). The extragradient method for finding saddle points and other problems. Matecon 12, pp. 747–756.
- Lin and Jordan (2022). Perseus: a simple high-order regularization method for variational inequalities. arXiv preprint arXiv:2205.03202.
- Monteiro and Svaiter (2012). Iteration-complexity of a Newton proximal extragradient method for monotone variational inequalities and inclusion problems. SIAM Journal on Optimization 22 (3), pp. 914–935.
- Monteiro and Svaiter (2010). On the complexity of the hybrid proximal extragradient method for the iterates and the ergodic mean. SIAM Journal on Optimization 20 (6), pp. 2755–2787.
- Monteiro and Svaiter (2013). An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods. SIAM Journal on Optimization 23 (2), pp. 1092–1125.
- Nemirovski (2004). Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM Journal on Optimization 15 (1), pp. 229–251.
- Nesterov and Polyak (2006). Cubic regularization of Newton method and its global performance. Mathematical Programming 108 (1), pp. 177–205.
- Nesterov (2006). Cubic regularization of Newton's method for convex problems with constraints. Technical report, CORE.
- Nesterov (2007). Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming 109 (2), pp. 319–344.
- Ouyang and Xu (2021). Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Mathematical Programming 185 (1), pp. 1–35.
- Tseng (2008). Accelerated proximal gradient methods for convex optimization. Technical report, University of Washington, Seattle.