
Learning Partially Observable Deterministic Action Models

by Eyal Amir et al.

We present exact algorithms for identifying the effects and preconditions of deterministic actions in dynamic, partially observable domains. They apply when one does not know a domain's action model (the way actions affect the world) and must learn it from partial observations over time. Such scenarios are common in real-world applications. They are challenging for AI tasks because the traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic with simple logical structure and observation models have every feature observed with some frequency. For such domains we obtain tractable algorithms for the modified problem. Our algorithms take sequences of partial observations over time as input and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice) and are exact in that no model is misclassified given the observations. They take time polynomial in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches based on HMMs and reinforcement learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results, which are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.
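To make the flavor of exact action-model learning concrete, here is a minimal sketch (not the paper's algorithm) of the underlying idea: enumerate candidate deterministic STRIPS-style effect models and eliminate every candidate whose predicted trajectory contradicts a partial observation. The two-fluent domain, the single anonymous action, and the assumption that the initial state is fully known are illustrative simplifications introduced here; the paper's algorithms handle unknown states via compact logical representations rather than explicit enumeration.

```python
from itertools import product

FLUENTS = ("on", "light")

# Candidate effect per fluent: True (set), False (clear), None (leave unchanged).
# 3^2 = 9 candidate deterministic effect models for one action.
CANDIDATES = [dict(zip(FLUENTS, eff))
              for eff in product((True, False, None), repeat=len(FLUENTS))]

def apply_model(model, state):
    """Deterministic transition: each fluent is overwritten or left unchanged."""
    return {f: (state[f] if model[f] is None else model[f]) for f in FLUENTS}

def filter_models(init_state, observations):
    """Keep each candidate whose predicted trajectory agrees with every
    partial observation. `observations` is one dict per action step, mapping
    an observed subset of fluents to their observed values."""
    consistent = []
    for model in CANDIDATES:
        state, ok = dict(init_state), True
        for obs in observations:
            state = apply_model(model, state)
            if any(state[f] != v for f, v in obs.items()):
                ok = False
                break
        if ok:
            consistent.append(model)
    return consistent

# Hypothetical run: the (hidden) action turns the light on, leaves "on" alone.
init = {"on": False, "light": False}
partial_obs = [{"light": True}, {"on": False}]  # one partial observation per step
models = filter_models(init, partial_obs)
```

After these two steps, only models that set `light` to True survive, while the effect on `on` is narrowed to "clear" or "unchanged"; further observations would disambiguate. No model consistent with the observations is ever discarded, which mirrors the exactness property the abstract claims.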
