Generalized Risk-Aversion in Stochastic Multi-Armed Bandits

05/05/2014

We consider the problem of minimizing regret in stochastic multi-armed bandits when the measure of goodness of an arm is not the mean return, but some general function of the mean and the variance. We characterize the conditions under which learning is possible and present examples for which no natural algorithm can achieve sublinear regret.
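
As a rough illustration of what it means to rank arms by a function of the mean and the variance, the sketch below scores each arm with a mean-variance criterion and sums the per-pull score gaps. The criterion mean - rho * variance, the parameter rho, the gap-based notion of regret, and all function names are assumptions chosen for illustration; the paper's own definitions may differ.

```python
from typing import Callable, List, Sequence


def mean_variance_score(mean: float, variance: float, rho: float = 1.0) -> float:
    """Hypothetical risk-averse score: reward the mean, penalize the variance."""
    return mean - rho * variance


def score_gaps(
    means: Sequence[float],
    variances: Sequence[float],
    score: Callable[[float, float], float] = mean_variance_score,
) -> List[float]:
    """Gap between the best arm's score and each arm's score."""
    scores = [score(m, v) for m, v in zip(means, variances)]
    best = max(scores)
    return [best - s for s in scores]


def gap_regret(pulls: Sequence[int], gaps: Sequence[float]) -> float:
    """Assumed regret notion: each pull of an arm costs that arm's score gap."""
    return sum(n * g for n, g in zip(pulls, gaps))


# Arm 0 has the higher mean but is much noisier than arm 1.
means = [1.0, 0.5]
variances = [2.0, 0.25]
gaps = score_gaps(means, variances)
regret = gap_regret([10, 90], gaps)
```

Under this assumed criterion the arm with the higher mean is not the best arm, which is the kind of situation in which a learner that tracks only mean returns would be misled.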
