Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control

by Yangchen Pan, et al.

Recent work has shown that reinforcement learning (RL) is a promising approach to controlling dynamical systems described by partial differential equations (PDEs). This paper shows how to use RL to tackle more general PDE control problems that have continuous, high-dimensional action spaces with spatial relationships among the action dimensions. In particular, we propose the concept of action descriptors, which encode regularities among spatially extended action dimensions and enable the agent to control PDEs with high-dimensional actions. We provide theoretical evidence suggesting that this approach can be more sample efficient than a conventional approach that treats each action dimension separately and does not explicitly exploit the spatial regularity of the action space. The action descriptor approach is then used within the deep deterministic policy gradient algorithm. Experiments on two PDE control problems, with up to 256-dimensional continuous actions, show the advantage of the proposed approach over the conventional one.
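To make the idea of action descriptors concrete, here is a minimal sketch in plain Python. It rests on assumptions that are not taken from the paper: each descriptor is the (x, y) centre of a cell on a square actuator grid, and the policy is a single small tanh network, shared across all action dimensions, that maps a (state, descriptor) pair to one scalar action. The names `make_descriptors`, `init_params`, `action_at`, `policy`, and `count_params`, as well as the layer sizes, are hypothetical and chosen only for illustration.

```python
import math
import random


def make_descriptors(n_side):
    """Hypothetical descriptors: (x, y) cell centres of an n_side x n_side grid on [0, 1]^2."""
    centres = [(i + 0.5) / n_side for i in range(n_side)]
    return [(x, y) for x in centres for y in centres]


def init_params(state_dim, desc_dim, hidden, rng):
    """Randomly initialise one shared two-layer network (assumed architecture)."""
    in_dim = state_dim + desc_dim
    scale1 = 1.0 / math.sqrt(in_dim)
    scale2 = 1.0 / math.sqrt(hidden)
    return {
        "W1": [[rng.gauss(0.0, scale1) for _ in range(in_dim)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "W2": [rng.gauss(0.0, scale2) for _ in range(hidden)],
        "b2": 0.0,
    }


def action_at(params, state, descriptor):
    """Scalar action for one action dimension, identified by its descriptor."""
    inputs = list(state) + list(descriptor)
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    return math.tanh(sum(w * h for w, h in zip(params["W2"], hidden)) + params["b2"])


def policy(params, state, descriptors):
    """Full action vector: the same network evaluated once per descriptor."""
    return [action_at(params, state, c) for c in descriptors]


def count_params(params):
    """Number of scalar parameters in the shared network."""
    return (sum(len(row) for row in params["W1"]) + len(params["b1"])
            + len(params["W2"]) + 1)


rng = random.Random(0)
state = [rng.gauss(0.0, 1.0) for _ in range(8)]  # placeholder state features
params = init_params(state_dim=8, desc_dim=2, hidden=32, rng=rng)

coarse = policy(params, state, make_descriptors(4))   # 16 action dimensions
fine = policy(params, state, make_descriptors(16))    # 256 action dimensions
```

In this sketch the same parameters produce both the 16-dimensional and the 256-dimensional action, so the parameter count does not depend on the number of action dimensions. A policy with one output unit per action dimension would instead need an output layer that grows with the action dimension.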


Deep Reinforcement Learning for Online Control of Stochastic Partial Differential Equations

In many areas, such as the physical sciences, life sciences, and finance...

Actor-Critic Algorithm for High-dimensional Partial Differential Equations

We develop a deep learning model to effectively solve high-dimensional n...

Incremental Reinforcement Learning --- a New Continuous Reinforcement Learning Frame Based on Stochastic Differential Equation methods

Continuous reinforcement learning such as DDPG and A3C are widely used i...

A Crash Course on Reinforcement Learning

The emerging field of Reinforcement Learning (RL) has led to impressive ...

Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning

This article shows how the recent breakthroughs in Reinforcement Learnin...

Distributional Offline Continuous-Time Reinforcement Learning with Neural Physics-Informed PDEs (SciPhy RL for DOCTR-L)

This paper addresses distributional offline continuous-time reinforcemen...

Recovering a probability measure from its multivariate spatial rank

We address the problem of recovering a probability measure P over ℝ^n (e....