
Soft-Robust Algorithms for Handling Model Misspecification

11/30/2020
by Elita A. Lobo, et al.

In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion, which minimizes the probability of a catastrophic failure. Unfortunately, such policies are typically overly conservative because the percentile criterion is non-convex, difficult to optimize, and ignores the mean performance. To overcome these shortcomings, we study the soft-robust criterion, which uses risk measures to better balance the mean and percentile criteria. In this paper, we establish the soft-robust criterion's fundamental properties, show that it is NP-hard to optimize, and propose and analyze two algorithms to optimize it approximately. Our theoretical analyses and empirical evaluations demonstrate that our algorithms compute much less conservative solutions than the existing approximate methods for optimizing the percentile criterion.
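
To make the trade-off concrete, the following is a minimal sketch of how a mean, a percentile (value-at-risk), and a soft-robust score could be computed from the returns a fixed policy earns under several sampled models. It assumes, for illustration only, that the soft-robust score is a convex combination of the mean and the conditional value-at-risk (CVaR). The function names and the parameters `alpha` and `lam` are hypothetical and are not taken from the abstract above.

```python
import math


def value_at_risk(returns, alpha):
    """Empirical lower alpha-quantile of the returns (the percentile value)."""
    ordered = sorted(returns)
    # Index of the worst ceil(alpha * n)-th return, clamped to a valid position.
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[k]


def cvar(returns, alpha):
    """Simple sample estimate of CVaR: mean of the worst ceil(alpha * n) returns."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def soft_robust(returns, alpha, lam):
    """Assumed soft-robust score: lam * mean + (1 - lam) * CVaR_alpha.

    lam = 1 recovers the mean criterion; lam = 0 is fully risk-averse.
    """
    mean = sum(returns) / len(returns)
    return lam * mean + (1.0 - lam) * cvar(returns, alpha)


# Hypothetical returns of one policy under ten sampled models.
sample_returns = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With `alpha = 0.2`, the percentile value looks only at the second-worst return, whereas the soft-robust score also credits the policy for its average performance, which is why it tends to be less conservative.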

