Exploring the Robustness of Distributional Reinforcement Learning against Noisy State Observations

09/17/2021
by   Ke Sun, et al.
7

In real scenarios, state observations that an agent observes may contain measurement errors or adversarial noises, misleading the agent to take suboptimal actions or even collapse while training. In this paper, we study the training robustness of distributional Reinforcement Learning (RL), a class of state-of-the-art methods that estimate the whole distribution, as opposed to only the expectation, of the total return. Firstly, we propose State-Noisy Markov Decision Process (SN-MDP) in the tabular case to incorporate both random and adversarial state observation noises, in which the contraction of both expectation-based and distributional Bellman operators is derived. Beyond SN-MDP with the function approximation, we theoretically characterize the bounded gradient norm of histogram-based distributional loss, accounting for the better training robustness of distribution RL. We also provide stricter convergence conditions of the Temporal-Difference (TD) learning under more flexible state noises, as well as the sensitivity analysis by the leverage of influence function. Finally, extensive experiments on the suite of games show that distributional RL enjoys better training robustness compared with its expectation-based counterpart across various state observation noises.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/28/2021

Robustness and risk management via distributional dynamic programming

In dynamic programming (DP) and reinforcement learning (RL), an agent le...
research
02/01/2022

Distributional Reinforcement Learning via Sinkhorn Iterations

Distributional reinforcement learning (RL) is a class of state-of-the-ar...
research
10/07/2021

Towards Understanding Distributional Reinforcement Learning: Regularization, Optimization, Acceleration and Sinkhorn Algorithm

Distributional reinforcement learning (RL) is a class of state-of-the-ar...
research
09/29/2022

How Does Value Distribution in Distributional Reinforcement Learning Help Optimization?

We consider the problem of learning a set of probability distributions f...
research
05/13/2018

GAN Q-learning

Distributional reinforcement learning (distributional RL) has seen empir...
research
10/21/2022

Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables

One key challenge for multi-task Reinforcement learning (RL) in practice...
research
05/25/2023

The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning

While distributional reinforcement learning (RL) has demonstrated empiri...

Please sign up or login with your details

Forgot password? Click here to reset