Follow the Leader If You Can, Hedge If You Must

01/03/2013
by Steven de Rooij et al.

Follow-the-Leader (FTL) is an intuitive sequential prediction strategy that guarantees constant regret in the stochastic setting, but has terrible performance for worst-case data. Other hedging strategies have better worst-case guarantees but may perform much worse than FTL if the data are not maximally adversarial. We introduce the FlipFlop algorithm, which is the first method that provably combines the best of both worlds. As part of our construction, we develop AdaHedge, which is a new way of dynamically tuning the learning rate in Hedge without using the doubling trick. AdaHedge refines a method by Cesa-Bianchi, Mansour and Stoltz (2007), yielding slightly improved worst-case guarantees. By interleaving AdaHedge and FTL, the FlipFlop algorithm achieves regret within a constant factor of the FTL regret, without sacrificing AdaHedge's worst-case guarantees. AdaHedge and FlipFlop do not need to know the range of the losses in advance; moreover, unlike earlier methods, both have the intuitive property that the issued weights are invariant under rescaling and translation of the losses. The losses are also allowed to be negative, in which case they may be interpreted as gains.
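To make the contrast concrete, here is a minimal sketch (not the paper's AdaHedge or FlipFlop) comparing plain FTL against Hedge with a fixed learning rate on the classic adversarial loss sequence: after a half-loss tiebreaker, the leader flips every round, so FTL chases yesterday's leader and pays nearly every round, while near-uniform Hedge pays about half. The loss matrix and the learning rate eta below are illustrative choices, not values from the paper.

```python
import numpy as np

def ftl_weights(cum_losses):
    """Follow-the-Leader: all mass on the expert(s) with smallest cumulative loss."""
    leaders = cum_losses == cum_losses.min()
    return leaders / leaders.sum()

def hedge_weights(cum_losses, eta):
    """Hedge: exponential weights with fixed learning rate eta."""
    w = np.exp(-eta * (cum_losses - cum_losses.min()))  # shift for numerical stability
    return w / w.sum()

def run(losses, strategy):
    """Play a weight strategy over a T x K loss matrix; return total expected loss."""
    cum = np.zeros(losses.shape[1])
    total = 0.0
    for loss in losses:
        total += strategy(cum) @ loss
        cum += loss
    return total

# Adversarial-style sequence: after the first round the leader alternates,
# so FTL always backs the expert that is about to lose.
losses = np.array([[0.5, 0.0]] + [[0.0, 1.0], [1.0, 0.0]] * 10)
ftl_loss = run(losses, ftl_weights)
hedge_loss = run(losses, lambda c: hedge_weights(c, eta=0.1))
best_expert = losses.sum(axis=0).min()
```

On this sequence the best single expert incurs total loss 10, FTL incurs about 20 (near-linear regret), and Hedge stays within a small constant of the best expert. On stochastic data the roles reverse, which is the gap FlipFlop is designed to close.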


