Online Learning with Switching Costs and Other Adaptive Adversaries

02/18/2013
by Nicolò Cesa-Bianchi, et al.

We study the power of different types of adaptive (nonoblivious) adversaries in the setting of prediction with expert advice, under both full-information and bandit feedback. We measure the player's performance using a new notion of regret, also known as policy regret, which better captures the adversary's adaptiveness to the player's behavior. In a setting where losses are allowed to drift, we characterize, in a nearly complete manner, the power of adaptive adversaries with bounded memories and switching costs. In particular, we show that with switching costs, the attainable rate with bandit feedback is Θ(T^(2/3)). Interestingly, this rate is significantly worse than the Θ(√T) rate attainable with switching costs in the full-information case. Via a novel reduction from experts to bandits, we also show that a bounded-memory adversary can force Θ(T^(2/3)) regret even in the full-information case, proving that switching costs are easier to control than bounded-memory adversaries. Our lower bounds rely on a new stochastic adversary strategy that generates loss processes with strong dependencies.
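To make the switching-cost setting concrete, the following is a minimal sketch, not the authors' construction. It assumes a fixed (oblivious) loss table, a unit cost charged whenever the player changes action, and a best-fixed-action comparator; a constant action sequence never switches, so the comparator pays no switching cost. The function name `switching_cost_regret` and its parameters are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def switching_cost_regret(
    losses: Sequence[Sequence[float]],
    actions: Sequence[int],
    switch_cost: float = 1.0,  # assumption: unit cost per switch
) -> float:
    """Regret of `actions` against the best fixed action.

    losses[t][i] is the loss of action i at round t (assumed fixed in
    advance); actions[t] is the action the player chose at round t.
    """
    if len(losses) != len(actions):
        raise ValueError("losses and actions must cover the same rounds")
    if not losses:
        return 0.0

    # Player's cumulative loss, plus a penalty each time the action changes.
    total = 0.0
    for t, action in enumerate(actions):
        total += losses[t][action]
        if t > 0 and action != actions[t - 1]:
            total += switch_cost

    # Best fixed action in hindsight; it never switches, so it pays no penalty.
    n_actions = len(losses[0])
    best_fixed = min(sum(row[i] for row in losses) for i in range(n_actions))
    return total - best_fixed


# Two actions whose losses alternate between 0 and 1.
alternating = [[0, 1], [1, 0], [0, 1], [1, 0]]
chasing = switching_cost_regret(alternating, [0, 1, 0, 1])  # switches every round
staying = switching_cost_regret(alternating, [0, 0, 0, 0])  # never switches
```

In this toy instance the player who chases the zero-loss action incurs no base loss but pays for three switches, and ends up worse than the player who stays put. This tension between exploring and avoiding switches is what the bandit-feedback rate in the abstract quantifies.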


