Smoothed Online Learning is as Easy as Statistical Learning

by   Adam Block, et al.

Much of modern learning theory has been split between two regimes: the classical offline setting, where data arrive independently, and the online setting, where data arrive adversarially. While the former model is often both computationally and statistically tractable, the latter requires no distributional assumptions. In an attempt to achieve the best of both worlds, previous work proposed the smooth online setting where each sample is drawn from an adversarially chosen distribution, which is smooth, i.e., it has a bounded density with respect to a fixed dominating measure. We provide tight bounds on the minimax regret of learning a nonparametric function class, with nearly optimal dependence on both the horizon and smoothness parameters. Furthermore, we provide the first oracle-efficient, no-regret algorithms in this setting. In particular, we propose an oracle-efficient improper algorithm whose regret achieves optimal dependence on the horizon and a proper algorithm requiring only a single oracle call per round whose regret has the optimal horizon dependence in the classification setting and is sublinear in general. Both algorithms have exponentially worse dependence on the smoothness parameter of the adversary than the minimax rate. We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning. Finally, we apply our results to the contextual bandit setting to show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits in the case that contexts arrive in a smooth manner.


page 1

page 2

page 3

page 4


Online Improper Learning with an Approximation Oracle

We revisit the question of reducing online learning to approximate optim...

Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions

Due to the drastic gap in complexity between sequential and batch statis...

Contextual Inverse Optimization: Offline and Online Learning

We study the problems of offline and online contextual optimization with...

Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games

Which classes can be learned properly in the online model? – that is, by...

Efficient Optimal Learning for Contextual Bandits

We address the problem of learning in an online setting where the learne...

Online Learning: Stochastic and Constrained Adversaries

Learning theory has largely focused on two main learning scenarios. The ...

Taking a hint: How to leverage loss predictors in contextual bandits?

We initiate the study of learning in contextual bandits with the help of...