Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

06/13/2020
by Arun Sai Suggala, et al.

We consider the problem of online learning and its application to solving minimax games. For the online learning problem, Follow the Perturbed Leader (FTPL) is a widely studied algorithm that enjoys the optimal O(T^1/2) worst-case regret guarantee for both convex and nonconvex losses. In this work, we show that when the sequence of loss functions is predictable, a simple modification of FTPL that incorporates optimism achieves better regret guarantees, while retaining the optimal worst-case regret guarantee for unpredictable sequences. A key challenge in obtaining these tighter regret bounds is the interplay of stochasticity and optimism in the algorithm, which requires different analysis techniques than those commonly used in the analysis of FTPL. The key ingredient we utilize in our analysis is the dual view of perturbation as regularization. While our algorithm has several applications, we consider the specific application of minimax games. For solving smooth convex-concave games, our algorithm only requires access to a linear optimization oracle. For Lipschitz and smooth nonconvex-nonconcave games, our algorithm requires access to an optimization oracle which computes the perturbed best response. In both these settings, our algorithm solves the game up to an accuracy of O(T^-1/2) using T calls to the optimization oracle. An important feature of our algorithm is that it is highly parallelizable and requires only O(T^1/2) iterations, with each iteration making O(T^1/2) parallel calls to the optimization oracle.
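As a rough illustration of the idea, the sketch below runs an optimistic FTPL update for online linear optimization over the probability simplex, where the linear optimization oracle reduces to picking a coordinate. The exponential perturbation, its scale, and the recency prediction m_{t+1} = g_t are illustrative assumptions, not the paper's exact construction; in particular, the paper's parallel averaging of many perturbed oracle calls per round is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 200
scale = 5.0  # perturbation magnitude (illustrative choice)

def linear_oracle(g):
    # Linear optimization oracle over the probability simplex:
    # the minimizer of <g, x> is always a vertex (a one-hot vector).
    x = np.zeros(d)
    x[np.argmin(g)] = 1.0
    return x

# A slowly drifting (hence predictable) sequence of linear losses.
base = rng.standard_normal(d)
loss_vectors = [base + 0.1 * rng.standard_normal(d) for _ in range(T)]

cum_loss = np.zeros(d)   # sum of observed loss vectors
guess = np.zeros(d)      # optimistic prediction m_t of the next loss
total = 0.0

for g_t in loss_vectors:
    # Optimistic FTPL: follow the perturbed leader on the observed
    # losses *plus* the prediction of the upcoming loss.
    sigma = rng.exponential(scale, size=d)
    x_t = linear_oracle(cum_loss + guess - sigma)
    total += g_t @ x_t
    cum_loss += g_t
    guess = g_t          # recency prediction: m_{t+1} = g_t

# Regret against the best fixed action in hindsight.
best_fixed = cum_loss @ linear_oracle(cum_loss)
regret = total - best_fixed
print(f"regret after {T} rounds: {regret:.2f}")
```

When the prediction m_t is accurate, the perturbed leader rarely changes its action between rounds, which is the informal mechanism behind the improved regret bounds for predictable loss sequences.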


