Online Learning for Unknown Partially Observable MDPs

02/25/2021
by Mehdi Jafarnia-Jahromi, et al.

Solving Partially Observable Markov Decision Processes (POMDPs) is hard. Learning optimal controllers for POMDPs when the model is unknown is harder. Online learning of optimal controllers for unknown POMDPs, which requires efficient learning using regret-minimizing algorithms that effectively trade off exploration and exploitation, is even harder, and no solution currently exists. In this paper, we consider infinite-horizon average-cost POMDPs with an unknown transition model but a known observation model. We propose a natural posterior sampling-based reinforcement learning algorithm (POMDP-PSRL) and show that it achieves O(T^{2/3}) regret, where T is the time horizon. To the best of our knowledge, this is the first online RL algorithm for POMDPs, and it has sub-linear regret.
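
To make the sub-linear claim concrete, the following is a short worked sketch of what an O(T^{2/3}) regret bound implies in the average-cost setting. The notation is an assumption for illustration and is not taken from the abstract: c(s_t, a_t) denotes the cost incurred at time t, J^* the optimal long-run average cost, and K an unspecified constant that absorbs any model-dependent or logarithmic factors.

```latex
% Assumed notation: c = per-step cost, J^* = optimal average cost, K = unspecified constant.
\begin{align*}
  R_T &= \sum_{t=1}^{T} \bigl( c(s_t, a_t) - J^* \bigr)
      && \text{(cumulative regret over horizon } T\text{)} \\
  R_T &\le K\, T^{2/3}
      && \text{(the stated } O(T^{2/3}) \text{ bound)} \\
  \frac{R_T}{T} &\le K\, T^{-1/3} \;\longrightarrow\; 0
      \quad \text{as } T \to \infty
      && \text{(average regret per step vanishes)}
\end{align*}
```

Under these assumptions, the per-step gap to the optimal average cost shrinks at rate T^{-1/3}, which is what "sub-linear regret" means here: the learner's long-run average cost approaches that of the optimal controller.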


research
07/08/2021

Sublinear Regret for Learning POMDPs

We study the model-based undiscounted reinforcement learning for partial...
research
08/03/2009

Regret Bounds for Opportunistic Channel Access

We consider the task of opportunistic channel access in a primary system...
research
08/04/2023

Learning Optimal Admission Control in Partially Observable Queueing Networks

We present an efficient reinforcement learning algorithm that learns the...
research
12/06/2021

Lecture Notes on Partially Known MDPs

In these notes we will tackle the problem of finding optimal policies fo...
research
10/17/2022

Regret Bounds for Learning Decentralized Linear Quadratic Regulator with Partially Nested Information Structure

We study the problem of learning decentralized linear quadratic regulato...
research
07/10/2022

Learning to Order for Inventory Systems with Lost Sales and Uncertain Supplies

We consider a stochastic lost-sales inventory control system with a lead...
research
01/23/2019

Learning to Collaborate in Markov Decision Processes

We consider a two-agent MDP framework where agents repeatedly solve a ta...
