A multilevel reinforcement learning framework for PDE based control

by Atish Dixit, et al.

Reinforcement learning (RL) is a promising method for solving control problems. However, model-free RL algorithms are sample-inefficient and require thousands, if not millions, of samples to learn optimal control policies. A major source of computational cost in RL is the transition function, which is dictated by the model dynamics. This is especially problematic when the model dynamics are represented by coupled PDEs. In such cases, evaluating the transition function often involves solving a large-scale discretization of those PDEs. We propose a multilevel RL framework that eases this cost by exploiting sublevel models corresponding to coarser discretizations (i.e., multilevel models). This is done by formulating an approximate multilevel Monte Carlo estimate of the objective function of the policy and/or value network, in place of the Monte Carlo estimates used in the classical framework. As a demonstration of this framework, we present a multilevel version of the proximal policy optimization (PPO) algorithm. Here, the level refers to the grid fidelity of the chosen simulation-based environment. We provide two examples of simulation-based environments that employ stochastic PDEs solved using finite-volume discretization. For the case studies presented, we observed substantial computational savings using multilevel PPO compared to its classical counterpart.
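To make the multilevel Monte Carlo idea concrete, here is a minimal sketch of the standard telescoping estimator on a toy problem. It is not the authors' multilevel PPO algorithm, and it does not reproduce their "approximate" variant. The function names `simulate` and `mlmc_estimate`, the toy integrand, and the sample counts are all assumptions made for illustration. The cheap midpoint-rule integral stands in for a PDE solve, with the level controlling grid fidelity.

```python
import math
import random


def simulate(w, level, base_cells=4):
    """Hypothetical stand-in for a PDE solve at a given grid level.

    Returns the midpoint-rule approximation of the integral of exp(w * x)
    over [0, 1], using base_cells * 2**level cells.
    """
    n = base_cells * 2 ** level
    h = 1.0 / n
    return h * sum(math.exp(w * (i + 0.5) * h) for i in range(n))


def mlmc_estimate(samples_per_level, rng):
    """Telescoping MLMC estimate of E[Q_L], with w drawn uniformly from [0, 1).

    Level 0 contributes the mean of Q_0. Each level l > 0 contributes the mean
    of the correction Q_l - Q_{l-1}, where both terms use the same random w.
    """
    estimate = 0.0
    for level, n_samples in enumerate(samples_per_level):
        total = 0.0
        for _ in range(n_samples):
            w = rng.random()  # one random input shared by both grid levels
            fine = simulate(w, level)
            coarse = simulate(w, level - 1) if level > 0 else 0.0
            total += fine - coarse
        estimate += total / n_samples
    return estimate


# Many samples on the coarsest grid, few on the finest (illustrative counts).
rng = random.Random(0)
estimate = mlmc_estimate([4000, 400, 40], rng)
print(f"MLMC estimate: {estimate:.4f}")
```

In this sketch only 40 of the samples require the finest grid, while most of the sampling effort goes to the cheapest level. That allocation is the source of the savings that a multilevel estimator aims for, because the corrections between adjacent levels have much smaller variance than the quantity itself.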



Robust optimal well control using an adaptive multi-grid reinforcement learning framework

Reinforcement learning (RL) is a promising tool to solve robust optimal ...

MG/OPT and MLMC for Robust Optimization of PDEs

An algorithm is proposed to solve robust control problems constrained by...

Algorithms for Solving High Dimensional PDEs: From Nonlinear Monte Carlo to Machine Learning

In recent years, tremendous progress has been made on numerical algorith...

Hamiltonian Q-Learning: Leveraging Importance-sampling for Data Efficient RL

Model-free reinforcement learning (RL), in particular Q-learning is wide...

ACERAC: Efficient reinforcement learning in fine time discretization

We propose a framework for reinforcement learning (RL) in fine time disc...

Multilevel Minimization for Deep Residual Networks

We present a new multilevel minimization framework for the training of d...

Stable Reinforcement Learning with Unbounded State Space

We consider the problem of reinforcement learning (RL) with unbounded st...
