Materials Discovery using Max K-Armed Bandit

12/16/2022
by Nobuaki Kikkawa, et al.

Search algorithms for bandit problems are applicable to materials discovery. However, the objective of the conventional bandit problem differs from that of materials discovery. The conventional bandit problem aims to maximize the total reward, whereas materials discovery aims to achieve breakthroughs in material properties. The max K-armed bandit (MKB) problem, which aims to acquire the single best reward, matches discovery tasks better than the conventional bandit problem does. Thus, we propose a search algorithm for materials discovery based on the MKB problem, using a pseudo-value of the upper confidence bound of the expected improvement of the best reward. This approach is pseudo-guaranteed to behave asymptotically as an oracle that does not depend on the time horizon. In addition, unlike other MKB algorithms, the proposed algorithm has only one hyperparameter, which is advantageous in materials discovery. We applied the proposed algorithm to synthetic problems and to molecular-design demonstrations using a Monte Carlo tree search. According to the results, the proposed algorithm stably outperformed other bandit algorithms in the late stage of the search process when the optimal arm of the MKB could not be determined from its expected reward.
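The difference between the two objectives can be illustrated with a minimal sketch. It does not implement the proposed algorithm; it only compares the total reward and the single best reward obtained by pulling one arm repeatedly. The two arms, their Gaussian reward distributions, the horizon, and the names `steady_arm`, `risky_arm`, and `pull_only` are all assumptions made for illustration and do not come from the paper.

```python
import random


def pull_only(arm, horizon, rng):
    """Pull one arm `horizon` times; return (total reward, best single reward)."""
    rewards = [arm(rng) for _ in range(horizon)]
    return sum(rewards), max(rewards)


# Hypothetical arms (illustrative assumptions, not taken from the paper).
def steady_arm(rng):
    # Higher expected reward, narrow spread.
    return rng.gauss(1.0, 0.1)


def risky_arm(rng):
    # Lower expected reward, wide spread.
    return rng.gauss(0.0, 2.0)


rng = random.Random(0)  # fixed seed so the run is reproducible
horizon = 1000

steady_total, steady_best = pull_only(steady_arm, horizon, rng)
risky_total, risky_best = pull_only(risky_arm, horizon, rng)

# Conventional objective: compare the accumulated reward of each arm.
print("total reward:", round(steady_total, 1), "vs", round(risky_total, 1))
# MKB objective: compare the single best reward found on each arm.
print("best reward: ", round(steady_best, 2), "vs", round(risky_best, 2))
```

Under these assumed distributions, the steady arm accumulates the larger total, while the risky arm yields the larger single best reward. This is the kind of case the abstract refers to, in which the optimal arm for the MKB cannot be identified from the expected reward alone.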


research
12/23/2015

The Max K-Armed Bandit: PAC Lower Bounds and Efficient Algorithms

We consider the Max K-Armed Bandit problem, where a learning agent is fa...
research
11/13/2018

Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits

We propose a multi-armed bandit algorithm that explores based on randomi...
research
04/18/2021

Monte Carlo Elites: Quality-Diversity Selection as a Multi-Armed Bandit Problem

A core challenge of evolutionary search is the need to balance between e...
research
12/30/2021

Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Most algorithms for the multi-armed bandit problem in reinforcement lear...
research
07/27/2017

Max K-armed bandit: On the ExtremeHunter algorithm and beyond

This paper is devoted to the study of the max K-armed bandit problem, wh...
research
01/30/2020

HAMLET – A Learning Curve-Enabled Multi-Armed Bandit for Algorithm Selection

Automated algorithm selection and hyperparameter tuning facilitates the ...
research
05/16/2023

Scale-Adaptive Balancing of Exploration and Exploitation in Classical Planning

Balancing exploration and exploitation has been an important problem in ...
