Online Learning and Planning in Partially Observable Domains without Prior Knowledge

06/11/2019
by Yunlong Liu, et al.

How an agent can act optimally in stochastic, partially observable domains is a challenging problem. The standard approach is to first learn a model of the domain and then use the learned model to find a (near-)optimal policy. However, learning the model offline often requires storing the entire training data set and cannot use the data generated during the planning phase. Furthermore, current research usually assumes that the learned model is accurate or presupposes knowledge of the nature of the unobservable part of the world. In this paper, for systems with discrete settings, we propose a model-based planning approach built on Predictive State Representations (PSRs), in which both the learning and planning phases can be executed online and no prior knowledge of the underlying system is required. Experimental results show that, compared to state-of-the-art approaches, our algorithm achieves a high level of performance with no prior knowledge provided, while retaining the theoretical advantages of PSRs. Source code is available at https://github.com/DMU-XMU/PSR-MCTS-Online.
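To make the role of PSRs more concrete, the following is a minimal sketch of the standard linear PSR state update, in which the state is a vector of predictions for a set of core tests and is revised after each action-observation pair. It is not taken from the authors' repository: the function name `psr_update`, the parameter names, and the numbers in the usage example are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def psr_update(
    state: Sequence[float],
    m_ao: Sequence[float],
    m_aoq: Sequence[Sequence[float]],
) -> List[float]:
    """Standard linear PSR update after taking action a and observing o.

    state : current predictions p(q | h) for each core test q
    m_ao  : weight vector such that p(ao | h) = state . m_ao
    m_aoq : one weight vector per core test q, such that
            p(aoq | h) = state . m_aoq[q]

    Returns the new predictions p(q | hao) = p(aoq | h) / p(ao | h).
    All names here are hypothetical, not the paper's API.
    """
    p_ao = _dot(state, m_ao)
    if p_ao <= 0.0:
        raise ValueError("observation has zero probability under the model")
    return [_dot(state, w) / p_ao for w in m_aoq]


# Usage with made-up parameters for a model with two core tests.
state = [0.5, 0.2]
m_ao = [1.0, 0.0]                    # p(ao | h) = 0.5
m_aoq = [[0.5, 0.0], [0.0, 0.5]]     # p(aoq1 | h) = 0.25, p(aoq2 | h) = 0.1
new_state = psr_update(state, m_ao, m_aoq)
```

Because the update needs only the current prediction vector and the learned weight vectors, it can be applied step by step as new data arrive, which is the property an online learning and planning loop relies on.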

research
04/05/2019

Combining Offline Models and Online Monte-Carlo Tree Search for Planning from Scratch

Planning in stochastic and partially observable environments is a centra...
research
09/12/2016

DESPOT: Online POMDP Planning with Regularization

The partially observable Markov decision process (POMDP) provides a prin...
research
07/18/2022

Prior Knowledge Guided Unsupervised Domain Adaptation

The waive of labels in the target domain makes Unsupervised Domain Adapt...
research
06/22/2022

POGEMA: Partially Observable Grid Environment for Multiple Agents

We introduce POGEMA (https://github.com/AIRI-Institute/pogema) a sandbox...
research
12/12/2009

Closing the Learning-Planning Loop with Predictive State Representations

A central problem in artificial intelligence is that of planning to maxi...
research
01/17/2023

Syntactically Robust Training on Partially-Observed Data for Open Information Extraction

Open Information Extraction models have shown promising results with suf...
research
06/02/2023

Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance

The chest X-ray is often utilized for diagnosing common thoracic disease...
