Thompson Sampling with Diffusion Generative Prior

01/12/2023
by Yu-Guan Hsieh, et al.

In this work, we propose using denoising diffusion models to learn priors for online decision-making problems. We focus on the meta-learning for bandits framework, where the goal is to learn a strategy that performs well across bandit tasks of the same class. To this end, we train a diffusion model that learns the underlying task distribution and combine Thompson sampling with the learned prior to handle new tasks at test time. Our posterior sampling algorithm is designed to balance the learned prior against the noisy observations that come from the learner's interaction with the environment. To capture realistic bandit scenarios, we also propose a novel diffusion model training procedure that can train from incomplete and/or noisy data, which could be of independent interest. Finally, extensive experimental evaluations demonstrate the potential of the proposed approach.
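To make the role of the prior in Thompson sampling concrete, here is a minimal sketch of the generic algorithm for a Gaussian multi-armed bandit. It is not the method of this paper: an independent Gaussian prior per arm stands in for the learned diffusion prior, so the posterior has a closed form, and the function name `thompson_sampling` and all of its parameters are illustrative assumptions.

```python
import random


def thompson_sampling(true_means, prior_mean, prior_var, noise_var, horizon, seed=0):
    """Gaussian Thompson sampling with an independent Gaussian prior per arm.

    Assumption: this conjugate prior is a stand-in for a learned prior; the
    paper's diffusion-based posterior sampling is not reproduced here.
    """
    rng = random.Random(seed)
    k = len(true_means)
    # The posterior starts out equal to the prior.
    post_mean = list(prior_mean)
    post_var = list(prior_var)
    pulls = [0] * k
    regret = 0.0
    best = max(true_means)
    for _ in range(horizon):
        # Draw one plausible mean per arm from the current posterior.
        sampled = [rng.gauss(post_mean[a], post_var[a] ** 0.5) for a in range(k)]
        # Act greedily with respect to the sampled means.
        arm = max(range(k), key=lambda a: sampled[a])
        # Observe a noisy reward from the chosen arm.
        reward = rng.gauss(true_means[arm], noise_var ** 0.5)
        # Conjugate update: precision-weighted mix of prior belief and observation.
        precision = 1.0 / post_var[arm] + 1.0 / noise_var
        post_mean[arm] = (post_mean[arm] / post_var[arm] + reward / noise_var) / precision
        post_var[arm] = 1.0 / precision
        pulls[arm] += 1
        regret += best - true_means[arm]
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling(
        true_means=[0.0, 1.0, 0.2],
        prior_mean=[0.0, 0.0, 0.0],
        prior_var=[1.0, 1.0, 1.0],
        noise_var=0.25,
        horizon=200,
    )
    print("pulls per arm:", pulls)
    print("cumulative regret:", round(regret, 2))
```

In this sketch, a prior whose mean is closer to the true arm means lets the learner commit to good arms with fewer exploratory pulls, which is the motivation for learning the prior from related tasks.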

research
02/05/2023

Diffusion Model for Generative Image Denoising

In supervised learning for image denoising, usually the paired clean ima...
research
06/29/2021

Diffusion Priors In Variational Autoencoders

Among likelihood-based approaches for deep generative modelling, variati...
research
06/05/2020

Learning Multiclass Classifier Under Noisy Bandit Feedback

This paper addresses the problem of multiclass classification with corru...
research
07/13/2021

No Regrets for Learning the Prior in Bandits

We propose AdaTS, a Thompson sampling algorithm that adapts sequentially...
research
08/16/2017

Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors

Thompson sampling has impressive empirical performance for many multi-ar...
research
08/10/2023

Masked Diffusion as Self-supervised Representation Learner

Denoising diffusion probabilistic models have recently demonstrated stat...
research
06/17/2022

Diffusion models as plug-and-play priors

We consider the problem of inferring high-dimensional data 𝐱 in a model ...
