Saddle Point Optimization with Approximate Minimization Oracle
A major approach to saddle point optimization $\min_x \max_y f(x, y)$ is the gradient-based approach, as popularized by generative adversarial networks (GANs). In contrast, we analyze an alternative approach relying only on an oracle that solves a minimization problem approximately. At a given point (x, y), our approach locates an approximate solution x' to $\min_{x'} f(x', y)$ and an approximate solution y' to $\max_{y'} f(x, y')$, and updates (x, y) toward these approximate solutions (x', y') with a learning rate η. On locally strongly convex–concave smooth functions, we derive conditions on η under which the scheme exhibits linear convergence to a local saddle point, which reveals a possible shortcoming of recently developed robust adversarial reinforcement learning algorithms. We develop a heuristic, derivative-free approach to adapting η and implement the scheme with both zero-order and first-order minimization algorithms. Numerical experiments demonstrate the tightness of the theoretical results as well as the usefulness of the η adaptation mechanism.
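As a rough illustration of the update rule described above, the following Python sketch uses scipy.optimize.minimize as a stand-in approximate minimization oracle; the helper saddle_step, the quadratic test function, and the value η = 0.3 are illustrative assumptions, not the paper's implementation or its adaptive-η mechanism.

```python
import numpy as np
from scipy.optimize import minimize

def saddle_step(f, x, y, eta):
    """One update: move (x, y) a fraction eta toward the oracle's
    approximate best responses (x', y')."""
    # Approximate minimization oracle for x' ~ argmin_{x'} f(x', y).
    x_new = minimize(lambda x_: f(x_, y), x).x
    # The same oracle applied to -f yields y' ~ argmax_{y'} f(x, y').
    y_new = minimize(lambda y_: -f(x, y_), y).x
    # Update (x, y) toward (x', y') with learning rate eta.
    return x + eta * (x_new - x), y + eta * (y_new - y)

# Strongly convex-concave test function with unique saddle point (0, 0):
# f(x, y) = ||x||^2 / 2 + x.y - ||y||^2 / 2.
f = lambda x, y: 0.5 * x @ x + x @ y - 0.5 * y @ y
x, y = np.ones(2), -np.ones(2)
for _ in range(50):
    x, y = saddle_step(f, x, y, eta=0.3)
print(x, y)  # both vectors approach the saddle point at the origin
```

On this example the best responses are x' = -y and y' = x, so the iteration contracts linearly toward the origin for this choice of η, consistent with the kind of linear convergence condition the abstract describes.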