Autoregressive Bandits

12/12/2022
by Francesco Bacchiocchi, et al.

Autoregressive processes naturally arise in a large variety of real-world scenarios, including stock markets, sales forecasting, weather prediction, advertising, and pricing. When addressing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations should be properly accounted for in order to converge to the optimal decision policy. In this work, we propose a novel online learning setting, named Autoregressive Bandits (ARBs), in which the observed reward follows an autoregressive process of order k, whose parameters depend on the action the agent chooses within a finite set of n actions. Then, we devise an optimistic regret minimization algorithm, AutoRegressive Upper Confidence Bounds (AR-UCB), that suffers regret of order 𝒪((k+1)^{3/2} √(nT) / (1-Γ)^2), where T is the optimization horizon and Γ < 1 is an index of the stability of the system. Finally, we present a numerical validation in several synthetic settings and one real-world setting, in comparison with general-purpose and specific-purpose bandit baselines, showing the advantages of the proposed approach.
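To make the setting concrete, the following is a minimal sketch of an ARB-style environment, not the authors' implementation. It assumes the reward takes the form x_t = γ_0(a_t) + Σ_{i=1..k} γ_i(a_t) x_{t-i} + noise, with Gaussian noise, and that the stability index can be read as the largest sum of absolute autoregressive coefficients over the actions. The names `simulate_arb`, `stability_index`, and `GAMMAS`, as well as the coefficient values, are hypothetical and chosen only for illustration.

```python
import random


def simulate_arb(gammas, actions, noise_std=0.1, seed=0):
    """Simulate rewards of an action-dependent AR(k) process.

    gammas[a] = [g0, g1, ..., gk] holds the (assumed) parameters of action a:
    g0 is the offset, g1..gk weight the k most recent rewards.
    """
    rng = random.Random(seed)
    k = len(next(iter(gammas.values()))) - 1
    history = [0.0] * k  # most recent reward first; starts at zero
    rewards = []
    for a in actions:
        g = gammas[a]
        autoregressive_part = sum(g[i + 1] * history[i] for i in range(k))
        x = g[0] + autoregressive_part + rng.gauss(0.0, noise_std)
        rewards.append(x)
        history = ([x] + history)[:k]  # shift the window of past rewards
    return rewards


def stability_index(gammas):
    """Assumed reading of Γ: max over actions of the summed AR coefficients."""
    return max(sum(abs(c) for c in g[1:]) for g in gammas.values())


# Hypothetical parameters for n = 2 actions and order k = 2.
GAMMAS = {
    0: [1.0, 0.3, 0.2],
    1: [0.5, 0.6, 0.1],
}

if __name__ == "__main__":
    print("stability index:", stability_index(GAMMAS))
    always_zero = simulate_arb(GAMMAS, [0] * 200, noise_std=0.0)
    print("noise-free reward after 200 plays of action 0:", always_zero[-1])
```

With the noise switched off, repeatedly playing one action drives the reward toward γ_0 / (1 - Σ_i γ_i) for that action, which shows why a stability index below 1 matters: the closer it gets to 1, the more past rewards are amplified, consistent with the 1/(1-Γ)^2 factor in the stated regret bound.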

research
11/16/2022

Dynamical Linear Bandits

In many real-world sequential decision-making problems, an action does n...
research
04/20/2019

Waterfall Bandits: Learning to Sell Ads Online

A popular approach to selling online advertising is by a waterfall, wher...
research
07/26/2023

Online learning in bandits with predicted context

We consider the contextual bandit problem where at each time, the agent ...
research
07/28/2022

Distributed Stochastic Bandit Learning with Context Distributions

We study the problem of distributed stochastic multi-arm contextual band...
research
06/10/2021

Thompson Sampling with a Mixture Prior

We study Thompson sampling (TS) in online decision-making problems where...
research
12/14/2022

Invariant Lipschitz Bandits: A Side Observation Approach

Symmetry arises in many optimization and decision-making problems, and h...
research
08/04/2022

Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing

Motivated by practical considerations in machine learning for financial ...
