Neural Bandit with Arm Group Graph

06/08/2022
by   Yunzhe Qi, et al.
0

Contextual bandits aim to identify among a set of arms the optimal one with the highest reward based on their contextual information. Motivated by the fact that the arms usually exhibit group behaviors and the mutual impacts exist among groups, we introduce a new model, Arm Group Graph (AGG), where the nodes represent the groups of arms and the weighted edges formulate the correlations among groups. To leverage the rich information in AGG, we propose a bandit algorithm, AGG-UCB, where the neural networks are designed to estimate rewards, and we propose to utilize graph neural networks (GNN) to learn the representations of arm groups with correlations. To solve the exploitation-exploration dilemma in bandits, we derive a new upper confidence bound (UCB) built on neural networks (exploitation) for exploration. Furthermore, we prove that AGG-UCB can achieve a near-optimal regret bound with over-parameterized neural networks, and provide the convergence analysis of GNN with fully-connected layers which may be of independent interest. In the end, we conduct extensive experiments against state-of-the-art baselines on multiple public data sets, showing the effectiveness of the proposed algorithm.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/21/2023

Graph Neural Bandits

Contextual bandits algorithms aim to choose the optimal arm with the hig...
research
06/06/2021

Multi-facet Contextual Bandits: A Neural Network Perspective

Contextual multi-armed bandit has shown to be an effective tool in recom...
research
11/29/2021

Contextual Combinatorial Volatile Bandits with Satisfying via Gaussian Processes

In many real-world applications of combinatorial bandits such as content...
research
06/15/2021

Thompson Sampling for Unimodal Bandits

In this paper, we propose a Thompson Sampling algorithm for unimodal ban...
research
10/04/2022

Max-Quantile Grouped Infinite-Arm Bandits

In this paper, we consider a bandit problem in which there are a number ...
research
12/09/2019

Group Fairness in Bandit Arm Selection

We consider group fairness in the contextual bandit setting. Here, a seq...
research
04/19/2018

Exploring Partially Observed Networks with Nonparametric Bandits

Real-world networks such as social and communication networks are too la...

Please sign up or login with your details

Forgot password? Click here to reset