PAC Guarantees for Concurrent Reinforcement Learning with Restricted Communication

05/23/2019
by Or Raveh, et al.

We develop model-free PAC performance guarantees for multiple concurrent MDPs, extending recent work in which a single learner interacts with multiple non-interacting agents in a noise-free environment. Our framework allows noisy, resource-limited communication between agents and develops novel PAC guarantees in this extended setting. By allowing communication between the agents themselves, we suggest improved PAC-exploration algorithms that can overcome the communication noise and lead to improved sample complexity bounds. We provide a theoretically motivated algorithm that optimally combines information from the resource-limited agents, thereby analyzing the interaction between noise and communication constraints that are ubiquitous in real-world systems. We present empirical results for a simple task that support our theoretical formulations and improve upon naive information fusion methods.

