An Analysis of Reinforcement Learning for Malaria Control

07/19/2021
by Ndivhuwo Makondo, et al.

Previous work on policy learning for malaria control has often cast the task as an optimization problem, assuming that the objective function and the search space have a specific structure. The problem has been formulated as a multi-armed bandit, as a contextual bandit, and as a Markov Decision Process, but each formulation has been studied in isolation. Furthermore, emphasis has been placed on developing new algorithms specific to one instance of malaria control, while a plethora of simpler, more general algorithms in the literature has been ignored. In this work, we formally study the formulation of malaria control and present a comprehensive analysis of several formulations used in the literature. In addition, we implement and analyze several reinforcement learning algorithms in all formulations and compare them to black-box optimization. In contrast to previous work, our results show that simple algorithms based on Upper Confidence Bounds are sufficient for learning good malaria policies, and tend to outperform their more advanced counterparts on the malaria OpenAI Gym environment.
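
To make the Upper Confidence Bound idea concrete, here is a minimal sketch of the standard UCB1 rule for a multi-armed bandit. It is illustrative only: the abstract does not say which UCB variants the authors use, and the function name `ucb1`, the exploration constant `c`, and the three Bernoulli reward probabilities are assumptions made up for this example. They do not come from the paper or from the malaria Gym environment.

```python
import math
import random


def ucb1(pull, n_arms, n_rounds, c=2.0):
    """Run UCB1 for n_rounds and return (pull counts, empirical mean rewards).

    pull: callable mapping an arm index to a reward in [0, 1] (hypothetical interface).
    c:    exploration constant; c = 2 gives the classic sqrt(2 ln t / n) bonus.
    """
    counts = [0] * n_arms      # number of times each arm has been played
    means = [0.0] * n_arms     # running average reward of each arm
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Initialization: play every arm once so each count is non-zero.
            arm = t - 1
        else:
            # Optimism in the face of uncertainty: pick the arm with the
            # highest empirical mean plus confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the running mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Toy stand-in for an intervention policy search: three candidate actions
# with made-up success probabilities (NOT taken from the paper).
rng = random.Random(0)
probs = [0.2, 0.5, 0.8]
counts, means = ucb1(lambda a: 1.0 if rng.random() < probs[a] else 0.0,
                     n_arms=len(probs), n_rounds=2000)
```

In this toy setting the rule concentrates most of its pulls on the arm with the highest success probability while still sampling the others occasionally, which is the behavior that makes such simple methods a reasonable baseline before reaching for more specialized algorithms.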

research
06/22/2021

A Unified Framework for Conservative Exploration

We study bandits and reinforcement learning (RL) subject to a conservati...
research
07/15/2020

Upper Counterfactual Confidence Bounds: a New Optimism Principle for Contextual Bandits

The principle of optimism in the face of uncertainty is one of the most ...
research
11/11/2018

Adapting multi-armed bandits policies to contextual bandits scenarios

This work explores adaptations of successful multi-armed bandits policie...
research
05/01/2022

Processing Network Controls via Deep Reinforcement Learning

Novel advanced policy gradient (APG) algorithms, such as proximal policy...
research
02/10/2021

Non-stationary Reinforcement Learning without Prior Knowledge: An Optimal Black-box Approach

We propose a black-box reduction that turns a certain reinforcement lear...
research
07/13/2019

Parameterized Exploration

We introduce Parameterized Exploration (PE), a simple family of methods ...
research
04/10/2019

Charging control of electric vehicles using contextual bandits considering the electrical distribution grid

With the proliferation of electric vehicles, the electrical distribution...
