Byzantine-Robust Online and Offline Distributed Reinforcement Learning

06/01/2022
by   Yiding Chen, et al.

We consider a distributed reinforcement learning setting where multiple agents separately explore the environment and communicate their experiences through a central server. However, an α-fraction of the agents are adversarial and can report arbitrary fake information. Critically, these adversarial agents can collude, and their fake data can be of any size. We aim to robustly identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents. Our main technical contribution is Weighted-Clique, a novel algorithm for the problem of robust mean estimation from batches that can handle arbitrary batch sizes. Building on this new estimator, we design a Byzantine-robust distributed pessimistic value iteration algorithm for the offline setting and a Byzantine-robust distributed optimistic value iteration algorithm for the online setting. Both algorithms obtain near-optimal sample complexities and achieve stronger robustness guarantees than prior work.
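The core primitive in the abstract is robust mean estimation from batches of arbitrary sizes, where an α-fraction of the batches may be adversarial. The sketch below is not the paper's Weighted-Clique estimator; it is a minimal illustrative baseline (a batch-size-weighted median of batch means) showing the problem setting. The batch sizes, contamination fraction, and data generation are all assumed for the example.

```python
import numpy as np

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)
    return values[np.searchsorted(cum, 0.5 * cum[-1])]

def robust_mean_from_batches(batches):
    """
    Baseline robust estimate of a scalar mean from batches of arbitrary sizes.
    Each batch contributes its sample mean, weighted by its size; the weighted
    median then bounds the influence of a minority of adversarial batches.
    NOTE: illustrative baseline only, not the paper's Weighted-Clique algorithm.
    """
    means = np.array([np.mean(b) for b in batches])
    sizes = np.array([len(b) for b in batches], dtype=float)
    return weighted_median(means, sizes)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_mean, alpha, n_agents = 1.0, 0.2, 50           # illustrative values
    batches = []
    for i in range(n_agents):
        size = rng.integers(5, 200)                      # arbitrary batch sizes
        if i < int(alpha * n_agents):                    # colluding adversarial agents
            batches.append(np.full(size, 100.0))         # arbitrary fake data
        else:
            batches.append(rng.normal(true_mean, 1.0, size))
    naive = np.mean(np.concatenate(batches))
    robust = robust_mean_from_batches(batches)
    print(f"naive pooled mean: {naive:.3f}")
    print(f"robust estimate:   {robust:.3f}  (true mean = {true_mean})")
```

Pooling all reported samples lets the adversarial batches shift the naive mean arbitrarily, while the weighted-median baseline stays near the true mean as long as the adversarial batches carry less than half of the total weight; the paper's estimator targets much sharper guarantees under the stated α-fraction corruption model.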


