Neural-Progressive Hedging: Enforcing Constraints in Reinforcement Learning with Stochastic Programming

02/27/2022
by Supriyo Ghosh, et al.

We propose a framework, called neural-progressive hedging (NP), that leverages stochastic programming during the online phase of executing a reinforcement learning (RL) policy. The goal is to ensure feasibility with respect to constraints and risk-based objectives such as conditional value-at-risk (CVaR) during the execution of the policy, using probabilistic models of the state transitions to guide policy adjustments. The framework is particularly amenable to the class of sequential resource allocation problems, since existing approaches cannot enforce feasibility with respect to typical resource constraints in a scalable manner. The NP framework provides an alternative that adds modest overhead during the online phase. Experimental results demonstrate the efficacy of the NP framework on two continuous real-world tasks: (i) the portfolio optimization problem with liquidity constraints for financial planning, characterized by non-stationary state distributions; and (ii) the dynamic repositioning problem in bike-sharing systems, which embodies the class of supply-demand matching problems. We show that the NP framework produces policies that outperform those of deep RL and other baseline approaches, adapting to non-stationarity while satisfying structural constraints and accommodating risk measures in the resulting policies. Additional benefits of the NP framework are ease of implementation and better explainability of the policies.
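
To make the risk measure named in the abstract concrete, the sketch below estimates CVaR from a set of sampled losses as the average of the worst (1 - alpha) fraction of outcomes. It is only an illustration of the standard sample-based definition, not the paper's method: the function name `empirical_cvar`, the loss convention (larger is worse), and the default level `alpha=0.95` are all assumptions introduced here.

```python
import math


def empirical_cvar(losses, alpha=0.95):
    """Sample-based VaR and CVaR at level alpha (hypothetical helper).

    Assumes larger values mean worse outcomes. Returns a pair
    (var, cvar): the smallest loss in the worst tail, and the
    mean of that tail.
    """
    ordered = sorted(losses)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sampled loss")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # Number of samples in the tail. Rounding first guards against
    # floating-point noise such as (1 - 0.95) * 100 = 5.000000000000004.
    k = max(1, math.ceil(round((1.0 - alpha) * n, 9)))

    tail = ordered[-k:]          # the k largest losses
    var = tail[0]                # threshold loss (value-at-risk)
    cvar = sum(tail) / k         # average loss beyond the threshold
    return var, cvar


if __name__ == "__main__":
    # Illustrative data only: losses 1, 2, ..., 100.
    var, cvar = empirical_cvar(range(1, 101), alpha=0.95)
    print(f"VaR = {var}, CVaR = {cvar}")
```

Because CVaR averages the whole tail rather than reading off a single quantile, it is never smaller than the corresponding VaR, which is why it is a common choice when a policy must account for rare but severe outcomes.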

research
04/19/2022

COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

We consider the offline constrained reinforcement learning (RL) problem,...
research
06/26/2019

Approximate Dynamic Programming For Linear Systems with State and Input Constraints

Enforcing state and input constraints during reinforcement learning (RL)...
research
09/20/2021

CARL: Conditional-value-at-risk Adversarial Reinforcement Learning

In this paper we present a risk-averse reinforcement learning (RL) metho...
research
08/22/2019

Practical Risk Measures in Reinforcement Learning

Practical application of Reinforcement Learning (RL) often involves risk...
research
04/25/2019

DeepPR: Incremental Recovery for Interdependent VNFs with Deep Reinforcement Learning

The increasing reliance upon cloud services entails more flexible networ...
research
06/29/2022

Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning

We propose a novel framework to solve risk-sensitive reinforcement learn...
research
05/19/2021

Enforcing Policy Feasibility Constraints through Differentiable Projection for Energy Optimization

While reinforcement learning (RL) is gaining popularity in energy system...