Note on Thompson sampling for large decision problems

05/12/2019
by   Tao Hu, et al.

There is increasing interest in using streaming data to inform decision making across a wide range of application domains including mobile health, food safety, security, and resource management. A decision support system formalizes online decision making as a map from up-to-date information to a recommended decision. Online estimation of an optimal decision strategy from streaming data requires simultaneous estimation of components of the underlying system dynamics as well as the optimal decision strategy given these dynamics; thus, there is an inherent trade-off between choosing decisions that lead to improved estimates and choosing decisions that appear to be optimal based on current estimates. Thompson (1933) was among the first to formalize this trade-off in the context of choosing between two treatments for a stream of patients; he proposed a simple heuristic wherein a treatment is selected randomly at each time point with selection probability proportional to the posterior probability that it is optimal. We consider a variant of Thompson sampling that is simple to implement and can be applied to large and complex decision problems. We show that the proposed Thompson sampling estimator is consistent for the optimal decision support system and provide rates of convergence and finite sample error bounds. The proposed algorithm is illustrated using an agent-based model of the spread of influenza on a network and management of mallard populations in the United States.
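
To make the heuristic attributed to Thompson (1933) concrete, the following is a minimal sketch of the classical two-treatment case, not the variant proposed in the paper. It assumes binary patient outcomes and independent Beta(1, 1) priors on each treatment's success probability; the function name `thompson_two_treatments` and the simulated success probabilities are hypothetical and chosen only for illustration. Drawing one sample from each posterior and picking the treatment with the larger draw selects each treatment with probability equal to the posterior probability that it is optimal.

```python
import random


def thompson_two_treatments(success_probs, n_patients, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over a stream of patients.

    success_probs: true (unknown to the algorithm) success probability of
                   each treatment; used only to simulate outcomes.
    Returns per-treatment selection counts, successes, and failures.
    """
    rng = random.Random(seed)
    k = len(success_probs)
    successes = [0] * k
    failures = [0] * k
    counts = [0] * k

    for _ in range(n_patients):
        # One draw from each treatment's Beta posterior (Beta(1, 1) prior, an assumption).
        draws = [
            rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)
        ]
        # Choosing the largest draw selects treatment a with probability
        # equal to the posterior probability that a is optimal.
        arm = max(range(k), key=lambda a: draws[a])

        # Simulate the patient's binary outcome and update the posterior counts.
        outcome = rng.random() < success_probs[arm]
        counts[arm] += 1
        if outcome:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return counts, successes, failures


if __name__ == "__main__":
    counts, successes, failures = thompson_two_treatments([0.3, 0.7], 2000, seed=1)
    print("selection counts:", counts)
```

Early on, both posteriors are diffuse, so both treatments are tried; as evidence accumulates, the draws concentrate and the apparently better treatment is chosen more often, which is the trade-off between improving estimates and acting on current estimates described above.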


research
11/25/2022

A Note on Model-Free Reinforcement Learning with the Decision-Estimation Coefficient

We consider the problem of interactive decision making, encompassing str...
research
10/14/2020

Statistical Inference for Online Decision Making via Stochastic Gradient Descent

Online decision making aims to learn the optimal decision rule by making...
research
01/30/2013

Decision Theoretic Foundations of Graphical Model Selection

This paper describes a decision theoretic formulation of learning the gr...
research
11/08/2022

A Simple Algorithm for Online Decision Making

Motivated by recent progress on online linear programming (OLP), we stud...
research
11/16/2021

Sequential Unequal Probability Sampling For Stream Population

A new unequal probability sampling method is proposed. This method is se...
research
10/24/2016

Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition

We develop a Bayesian model for decision-making under time pressure with...
research
03/14/2023

Optimal Sampling Designs for Multi-dimensional Streaming Time Series with Application to Power Grid Sensor Data

The Internet of Things (IoT) system generates massive high-speed tempora...
