Online Saddle Point Problem with Applications to Constrained Online Convex Optimization

06/21/2018

We study an online saddle point problem in which, at each iteration, a pair of actions must be chosen without knowledge of the future (convex-concave) payoff functions. The objective is to minimize the gap between the cumulative payoffs and the saddle point value of the aggregate payoff function, which we measure using a metric called "SP-regret". The problem generalizes the online convex optimization framework and can be interpreted as finding the Nash equilibrium for the aggregate of a sequence of two-player zero-sum games. We propose an algorithm that achieves Õ(√T) SP-regret in the general case and O(log T) SP-regret in the strongly convex-concave case. We then consider a constrained online convex optimization problem motivated by a variety of applications in dynamic pricing, auctions, and crowdsourcing. We relate this problem to an online saddle point problem and establish O(√T) regret using a primal-dual algorithm.
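
As a rough formalization of the SP-regret described in the abstract, one way to write it is sketched here. The symbols are assumptions introduced for illustration, since the abstract does not fix any notation: $\mathcal{L}_t$ is the payoff function revealed at iteration $t$ (convex in $x$, concave in $y$), $X$ and $Y$ are the two players' action sets, and $(x_t, y_t)$ is the pair of actions chosen at iteration $t$. Measuring the gap as an absolute difference is also an assumption.

```latex
\[
\text{SP-Regret}(T)
  \;=\;
  \left|
    \sum_{t=1}^{T} \mathcal{L}_t(x_t, y_t)
    \;-\;
    \min_{x \in X} \max_{y \in Y} \sum_{t=1}^{T} \mathcal{L}_t(x, y)
  \right|
\]
```

Under this reading, if each $\mathcal{L}_t(x, y) = f_t(x)$ does not depend on $y$, the second term becomes $\min_{x \in X} \sum_{t} f_t(x)$, the comparator used in standard online convex optimization regret, which is consistent with the claim that the problem generalizes that framework.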
