Using Subjective Logic to Estimate Uncertainty in Multi-Armed Bandit Problems

08/17/2020
by   Fabio Massimo Zennaro, et al.

The multi-armed bandit problem is a classical decision-making problem where an agent has to learn an optimal action by balancing exploration and exploitation. Properly managing this trade-off requires a correct assessment of uncertainty; in multi-armed bandits, as in other machine learning applications, it is important to distinguish between stochasticity that is inherent to the system (aleatoric uncertainty) and stochasticity that derives from the limited knowledge of the agent (epistemic uncertainty). In this paper we consider the formalism of subjective logic, a concise and expressive framework for expressing Dirichlet-multinomial models as subjective opinions, and we apply it to the problem of multi-armed bandits. We propose new algorithms grounded in subjective logic to tackle the multi-armed bandit problem, we compare them against classical algorithms from the literature, and we analyze the insights they provide in evaluating the dynamics of uncertainty. Our preliminary results suggest that subjective logic quantities enable a useful assessment of uncertainty that may be exploited by more refined agents.
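To make the connection between subjective logic and bandits concrete, the sketch below shows how a binomial subjective opinion can be formed from the evidence counts of a Bernoulli arm (belief, disbelief, and uncertainty derived from positive and negative observations with a prior weight W = 2, following standard subjective logic), and how an agent might use the remaining uncertainty as an exploration signal. This is a minimal illustrative example, not the specific algorithms proposed in the paper; the optimism-style selection rule and all names are assumptions made for the sketch.

```python
import numpy as np

W = 2.0  # non-informative prior weight used in subjective logic


class SubjectiveOpinion:
    """Binomial subjective opinion derived from Beta evidence counts."""

    def __init__(self, base_rate=0.5):
        self.r = 0.0        # positive evidence (observed successes)
        self.s = 0.0        # negative evidence (observed failures)
        self.a = base_rate  # prior base rate

    def update(self, reward):
        # Accumulate evidence from one Bernoulli observation.
        if reward:
            self.r += 1.0
        else:
            self.s += 1.0

    @property
    def belief(self):
        return self.r / (self.r + self.s + W)

    @property
    def uncertainty(self):
        # Epistemic uncertainty: shrinks as evidence accumulates.
        return W / (self.r + self.s + W)

    def expected_value(self):
        # Projected probability: belief plus base rate times uncertainty.
        return self.belief + self.a * self.uncertainty


def choose_arm(opinions):
    # Illustrative (assumed) rule: favour arms whose projected probability
    # plus remaining epistemic uncertainty is highest, i.e. optimism in the
    # face of uncertainty expressed through subjective logic quantities.
    scores = [op.expected_value() + op.uncertainty for op in opinions]
    return int(np.argmax(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_probs = [0.2, 0.5, 0.8]  # hidden Bernoulli arm parameters
    opinions = [SubjectiveOpinion() for _ in true_probs]

    for t in range(1000):
        arm = choose_arm(opinions)
        reward = rng.random() < true_probs[arm]
        opinions[arm].update(reward)

    for i, op in enumerate(opinions):
        print(f"arm {i}: E={op.expected_value():.3f}, u={op.uncertainty:.3f}")
```

Running the sketch shows the qualitative behaviour the abstract alludes to: arms pulled often end up with low uncertainty and a projected probability close to their true mean, while rarely pulled arms retain high epistemic uncertainty, which is exactly the quantity a more refined agent could exploit.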


