Sublinear Regret for Learning POMDPs

07/08/2021
by Yi Xiong, et al.

We study model-based undiscounted reinforcement learning for partially observable Markov decision processes (POMDPs). The oracle we consider is the optimal policy of the POMDP with a known environment, measured by the average reward over an infinite horizon. We propose a learning algorithm for this problem that builds on spectral method-of-moments estimation for hidden Markov models, belief error control in POMDPs, and upper-confidence-bound methods for online learning. We establish a regret bound of O(T^{2/3} √(log T)) for the proposed learning algorithm, where T is the learning horizon. To the best of our knowledge, this is the first algorithm achieving sublinear regret with respect to our oracle for learning general POMDPs.
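To see why a bound of order T^{2/3} √(log T) counts as sublinear, it can help to look at the bound divided by T, which should shrink toward zero as the horizon grows. The snippet below is only an illustrative sketch: the function name `regret_bound` and the constant `C` are assumptions made here for illustration (the O(·) notation hides the true constant), and the code evaluates the bound's growth rate, not the algorithm itself.

```python
import math


def regret_bound(T: int, C: float = 1.0) -> float:
    """Evaluate C * T^(2/3) * sqrt(log T).

    C is a hypothetical placeholder for the constant hidden by O(.);
    the abstract does not state its value.
    """
    return C * T ** (2.0 / 3.0) * math.sqrt(math.log(T))


# Horizons from 10^2 to 10^8.
horizons = [10 ** k for k in range(2, 9)]

# Bound on the average per-step regret: bound(T) / T.
per_step = [regret_bound(T) / T for T in horizons]

for T, r in zip(horizons, per_step):
    print(f"T={T:>10d}  bound/T={r:.4f}")
```

The per-step values decrease monotonically over these horizons, which is what sublinear regret means in practice: the average gap to the oracle's reward vanishes as T grows, at a rate of roughly T^{-1/3} up to the logarithmic factor.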


