Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

12/30/2021
by Bin Chong, et al.

Most algorithms for the multi-armed bandit problem in reinforcement learning aim to maximize the expected reward, and are therefore useful for finding the single optimized candidate with the highest reward (function value) in diverse applications (e.g., AlphaGo). However, in some typical application scenarios such as drug discovery, the goal is instead to find a diverse set of candidates with high reward. Here we propose a reversible upper confidence bound (rUCB) algorithm for this purpose and demonstrate its application to virtual screening of intrinsically disordered proteins (IDPs). We show that rUCB greatly reduces the number of queries while achieving both high accuracy and low performance loss. The rUCB algorithm may have potential applications in multipoint optimization and other reinforcement-learning settings.
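For context, the rUCB algorithm builds on the standard upper confidence bound rule for multi-armed bandits. The sketch below is a minimal illustration of the classic UCB1 baseline (not the authors' rUCB), assuming a generic reward oracle `pull(arm)` and a hypothetical Bernoulli test setup; it shows the select-query-update loop that such algorithms share.

```python
import math
import random

def ucb1(pull, n_arms, n_rounds, c=2.0):
    """Standard UCB1 baseline: after trying each arm once, pick the arm
    maximizing mean reward + sqrt(c * ln(t) / n_i). Illustrative only."""
    counts = [0] * n_arms          # times each arm has been pulled
    sums = [0.0] * n_arms          # cumulative reward per arm

    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1            # initialization: try every arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(c * math.log(t) / counts[i]),
            )
        reward = pull(arm)         # query the (possibly expensive) oracle
        counts[arm] += 1
        sums[arm] += reward

    return counts, sums

# Hypothetical example: three Bernoulli arms standing in for candidate scores.
probs = [0.2, 0.5, 0.8]
counts, sums = ucb1(lambda i: float(random.random() < probs[i]), 3, 500)
print(counts)  # the high-reward arm should accumulate most of the pulls
```

Unlike this baseline, which concentrates pulls on the single best arm, the rUCB variant described in the abstract is designed to surface a diverse set of high-reward candidates while keeping the number of oracle queries low.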

