Forward-PECVaR Algorithm: Exact Evaluation for CVaR SSPs

03/01/2023
by Willy Arthur Silva Reis, et al.

The Stochastic Shortest Path (SSP) problem models probabilistic sequential-decision problems in which an agent must reach a goal while minimizing a cost function. Because the dynamics are probabilistic, it is desirable to use a cost function that accounts for risk. Conditional Value at Risk (CVaR) is a criterion that models an arbitrary level of risk by taking the expectation over the fraction α of worst trajectories. Although an optimal policy is non-Markovian, CVaR-SSPs can be solved approximately with Value Iteration based algorithms such as CVaR Value Iteration with Linear Interpolation (CVaRVILI) and CVaR Value Iteration via Quantile Representation (CVaRVIQ). The solutions these algorithms return depend on their parameters, such as the number of atoms and α_0 (the minimum α). To compare the policies returned by these algorithms, we need a way to evaluate stationary policies of CVaR-SSPs exactly. An algorithm that evaluates these policies already exists, but it only works on problems with uniform costs. In this paper, we propose a new algorithm, Forward-PECVaR (ForPECVaR), that exactly evaluates stationary policies of CVaR-SSPs with non-uniform costs. We empirically evaluate the approximate CVaR Value Iteration algorithms, comparing the quality of their solutions against the exact evaluation and measuring the influence of the algorithm parameters on solution quality and scalability. Experiments in two domains show that, to obtain a good approximation, it is important to use an α_0 smaller than the target α and an adequate number of atoms.
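To make the CVaR criterion described above concrete, the following minimal sketch computes the expected cost over the worst fraction α of outcomes for a finite distribution of trajectory costs. It is not the ForPECVaR algorithm, which is not specified in this abstract; the function name `cvar_worst_fraction` and the example distribution are hypothetical and chosen only for illustration. It assumes the convention used in the abstract, where α is the fraction of worst (highest-cost) trajectories, so α = 1 recovers the ordinary expected cost.

```python
from typing import Sequence


def cvar_worst_fraction(costs: Sequence[float],
                        probs: Sequence[float],
                        alpha: float) -> float:
    """Expected cost over the worst alpha fraction of probability mass.

    Illustrative helper (hypothetical name), not part of ForPECVaR.
    `costs[i]` is the total cost of outcome i and `probs[i]` its probability.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if len(costs) != len(probs):
        raise ValueError("costs and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")

    remaining = alpha  # probability mass still to be taken from the tail
    total = 0.0
    # Walk outcomes from highest to lowest cost, taking mass until alpha is used.
    for cost, prob in sorted(zip(costs, probs), key=lambda cp: cp[0], reverse=True):
        take = min(prob, remaining)
        total += take * cost
        remaining -= take
        if remaining <= 1e-12:
            break
    # Normalize by alpha so the result is a conditional expectation.
    return total / alpha


# Hypothetical trajectory-cost distribution of some fixed policy.
example_costs = [10.0, 20.0, 100.0]
example_probs = [0.5, 0.4, 0.1]

risk_neutral = cvar_worst_fraction(example_costs, example_probs, 1.0)
risk_averse = cvar_worst_fraction(example_costs, example_probs, 0.1)
```

In this toy distribution, lowering α shifts the evaluation toward the rare high-cost outcome, which is why two policies with the same expected cost can rank differently under CVaR.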


