Robust Experimentation in the Continuous Time Bandit Problem

03/31/2021
by   Farzad Pourbabaee, et al.

We study the experimentation dynamics of a decision maker (DM) in a two-armed bandit setup (Bolton and Harris (1999)), where the DM holds ambiguous beliefs about the distribution of the return process of one arm and is certain about the other. The DM has multiplier preferences à la Hansen and Sargent (2001), so we frame the decision-making environment as a two-player differential game against nature in continuous time. We characterize the DM's value function and her optimal experimentation strategy, which turns out to follow a cut-off rule with respect to her belief process. The belief threshold for exploring the ambiguous arm is found in closed form and is shown to be increasing in the ambiguity aversion index. We then study the effect of providing an unambiguous information source about the ambiguous arm. Interestingly, we show that the exploration threshold rises unambiguously as a result of this new information source, thereby leading to more conservatism. This analysis also sheds light on the efficient time to seek an expert opinion.
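To make the cut-off rule concrete, the following is a minimal sketch of a belief process in a standard two-state bandit, not the paper's model or its closed-form solution. It assumes the ambiguous arm pays increments with drift `mu_high` or `mu_low` plus Gaussian noise, that the DM updates her belief by Bayes' rule, and that she experiments only while her belief stays at or above `threshold`. All names and default values here are hypothetical, and `threshold` is a plain input rather than the closed-form threshold derived in the paper; ambiguity aversion is not modeled.

```python
import math
import random


def bayes_update(p, dx, mu_high, mu_low, sigma, dt):
    """Posterior probability of the high-drift state after observing increment dx."""
    var = sigma ** 2 * dt
    like_high = math.exp(-((dx - mu_high * dt) ** 2) / (2 * var))
    like_low = math.exp(-((dx - mu_low * dt) ** 2) / (2 * var))
    num = p * like_high
    return num / (num + (1 - p) * like_low)


def simulate_cutoff(p0, threshold, mu_high=1.0, mu_low=-1.0, sigma=1.0,
                    dt=0.01, horizon=10.0, true_high=True, seed=0):
    """Simulate the belief path under a cut-off rule (illustrative assumptions only).

    The DM pulls the uncertain arm while her belief is at or above `threshold`.
    Once the belief falls below it she switches to the known arm, which carries
    no information, so the belief stops moving and the path ends.
    """
    rng = random.Random(seed)
    mu = mu_high if true_high else mu_low
    p = p0
    path = [p]
    for _ in range(int(round(horizon / dt))):
        if p < threshold:
            break  # stop experimenting; belief is frozen from here on
        # Euler step for the observed return increment of the uncertain arm.
        dx = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        p = bayes_update(p, dx, mu_high, mu_low, sigma, dt)
        path.append(p)
    return path


if __name__ == "__main__":
    path = simulate_cutoff(p0=0.5, threshold=0.3)
    print(f"steps taken: {len(path) - 1}, final belief: {path[-1]:.3f}")
```

In this sketch, raising `threshold` makes the DM abandon the uncertain arm at more optimistic beliefs, which is the sense in which a higher exploration threshold corresponds to more conservative experimentation.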

research
08/20/2018

A General Framework of Multi-Armed Bandit Processes by Arm Switch Restrictions

This paper proposes a general framework of multi-armed bandit (MAB) proc...
research
08/11/2022

Understanding the stochastic dynamics of sequential decision-making processes: A path-integral analysis of Multi-armed Bandits

The multi-armed bandit (MAB) model is one of the most classical models t...
research
08/20/2018

A General Framework of Multi-Armed Bandit Processes by Switching Restrictions

This paper proposes a general framework of multi-armed bandit (MAB) proc...
research
08/03/2019

Multiplayer Bandit Learning, from Competition to Cooperation

The stochastic multi-armed bandit problem is a classic model illustratin...
research
01/31/2022

Rotting infinitely many-armed bandits

We consider the infinitely many-armed bandit problem with rotting reward...
research
07/08/2022

Malliavin calculus and its application to robust optimal portfolio for an insider

Insider information and model uncertainty are two unavoidable problems f...
research
07/26/2023

Evaluating the Moral Beliefs Encoded in LLMs

This paper presents a case study on the design, administration, post-pro...
