PAC-Bayesian Analysis of Martingales and Multiarmed Bandits

05/12/2011
by   Yevgeny Seldin, et al.

We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that makes it possible to bound expectations of convex functions of certain dependent random variables by expectations of the same functions of independent Bernoulli random variables. This lemma provides an alternative to the Hoeffding-Azuma inequality for bounding the concentration of martingale values. Our second approach is based on integrating the Hoeffding-Azuma inequality with PAC-Bayesian analysis. We also introduce a way to apply PAC-Bayesian analysis in situations of limited feedback. We combine the new tools to derive PAC-Bayesian generalization and regret bounds for the multiarmed bandit problem. Although our regret bound is not yet as tight as state-of-the-art regret bounds based on other well-established techniques, our results significantly expand the range of potential applications of PAC-Bayesian analysis and introduce a new analysis tool to reinforcement learning and many other fields where martingales and limited feedback are encountered.
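
For reference, the Hoeffding-Azuma inequality named in the abstract is commonly stated in the following one-sided form. The notation here (the martingale M_0, ..., M_n, the increment bounds c_i, and the deviation t) is chosen for illustration and is an assumption, not taken from the paper, which may use a different but equivalent formulation.

```latex
% Standard one-sided Hoeffding-Azuma inequality (textbook form; notation is illustrative).
% Assumption: M_0, M_1, \dots, M_n is a martingale whose increments are bounded,
% |M_i - M_{i-1}| \le c_i almost surely for i = 1, \dots, n.
\[
  \Pr\bigl( M_n - M_0 \ge t \bigr)
  \;\le\;
  \exp\!\left( - \frac{t^2}{2 \sum_{i=1}^{n} c_i^2} \right)
  \qquad \text{for all } t > 0 .
\]
% Applying the same bound to -M_i and taking a union bound gives the two-sided version:
\[
  \Pr\bigl( |M_n - M_0| \ge t \bigr)
  \;\le\;
  2 \exp\!\left( - \frac{t^2}{2 \sum_{i=1}^{n} c_i^2} \right).
\]
```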


