Exploration by Optimisation in Partial Monitoring

07/12/2019

We provide a simple and efficient algorithm for adversarial k-action d-outcome non-degenerate locally observable partial monitoring games for which the n-round minimax regret is bounded by $3(d+1)\,k^{3/2}\sqrt{8n\log(k)}$, matching the best known information-theoretic upper bounds.
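To get a feel for how the stated bound scales, the following minimal sketch evaluates it numerically. It reads the term under the square root as 8 n log(k) with the natural logarithm, which is an assumption about the abstract's formula; the function name `regret_bound` and the sample values of n, k and d are hypothetical and chosen only for illustration. It is not the paper's algorithm, only an evaluation of the bound.

```python
import math


def regret_bound(n: int, k: int, d: int) -> float:
    """Evaluate 3(d+1) k^(3/2) sqrt(8 n log k).

    n: number of rounds, k: number of actions, d: number of outcomes.
    Assumption: "log" is the natural logarithm.
    """
    if n < 0 or k < 1 or d < 1:
        raise ValueError("expected n >= 0, k >= 1, d >= 1")
    return 3 * (d + 1) * k ** 1.5 * math.sqrt(8 * n * math.log(k))


if __name__ == "__main__":
    # Illustrative values only: the bound grows like sqrt(n),
    # so the per-round value bound / n shrinks as n grows.
    for n in (10_000, 40_000, 160_000):
        bound = regret_bound(n, k=3, d=2)
        print(f"n={n:>7}  bound={bound:12.1f}  bound/n={bound / n:.4f}")
```

Because the bound depends on n only through the square root, quadrupling the number of rounds doubles the bound, so the per-round regret guaranteed by the bound tends to zero as n grows. For small n the bound exceeds n and is therefore vacuous.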
