Robust Reinforcement Learning via Adversarial Kernel Approximation

06/09/2023
by Kaixin Wang, et al.

Robust Markov Decision Processes (RMDPs) provide a framework for sequential decision-making that is robust to perturbations of the transition kernel. However, robust reinforcement learning (RL) approaches in RMDPs do not scale well to realistic online settings with high-dimensional domains. By characterizing the adversarial kernel in RMDPs, we propose a novel approach for online robust RL that approximates the adversarial kernel and uses a standard (non-robust) RL algorithm to learn a robust policy. Notably, our approach can be applied on top of any underlying RL algorithm, enabling easy scaling to high-dimensional domains. Experiments in classic control tasks, MinAtar, and DeepMind Control Suite demonstrate the effectiveness and applicability of our method.
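
To make the RMDP setting in the abstract concrete, the sketch below runs tabular robust value iteration, where an adversary picks the worst transition kernel for each state-action pair. It is only an illustration of the general framework, not the approach proposed in the paper: the finite list of candidate kernels, the (s, a)-rectangular choice of adversary, and the name `robust_value_iteration` are all assumptions made for this example.

```python
import numpy as np


def robust_value_iteration(kernels, rewards, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular robust value iteration over a finite set of candidate kernels.

    kernels: array of shape (K, S, A, S); kernels[k, s, a] is a distribution
             over next states under the k-th candidate kernel (assumed set).
    rewards: array of shape (S, A).
    Returns the robust state values (S,) and a greedy robust policy (S,).
    """
    kernels = np.asarray(kernels, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    v = np.zeros(rewards.shape[0])
    q = rewards.copy()
    for _ in range(max_iters):
        # Expected next-state value under every candidate kernel: shape (K, S, A).
        next_v = kernels @ v
        # The adversary picks the worst candidate independently for each (s, a).
        q = rewards + gamma * next_v.min(axis=0)
        new_v = q.max(axis=1)
        converged = np.max(np.abs(new_v - v)) < tol
        v = new_v
        if converged:
            break
    return v, q.argmax(axis=1)
```

Passing a single kernel (K = 1) reduces this to ordinary value iteration, which gives a simple way to compare nominal and robust values on a small problem.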

research
05/03/2017

Answer Set Programming for Non-Stationary Markov Decision Processes

Non-stationary domains, where unforeseen changes happen, present a chall...
research
11/30/2021

Model-Free μ Synthesis via Adversarial Reinforcement Learning

Motivated by the recent empirical success of policy-based reinforcement ...
research
09/28/2022

Online Policy Optimization for Robust MDP

Reinforcement learning (RL) has exceeded human performance in many synth...
research
07/15/2023

Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation

Robustness has been extensively studied in reinforcement learning (RL) t...
research
04/16/2020

Analyzing Reinforcement Learning Benchmarks with Random Weight Guessing

We propose a novel method for analyzing and visualizing the complexity o...
research
12/28/2022

Certifying Safety in Reinforcement Learning under Adversarial Perturbation Attacks

Function approximation has enabled remarkable advances in applying reinf...
research
05/22/2022

Power and accountability in reinforcement learning applications to environmental policy

Machine learning (ML) methods already permeate environmental decision-ma...
