Decision Making Agent Searching for Markov Models in Near-Deterministic World

02/27/2011
by Gabor Matuz, et al.

Reinforcement learning has solid foundations but becomes inefficient in partially observed (non-Markovian) environments. Thus, a learning agent, born with a representation and a policy, might wish to investigate to what extent the Markov property holds. We propose a learning architecture that (i) uses combinatorial policy optimization to overcome non-Markovity and to develop efficient behaviors, which are easy to inherit, (ii) tests the Markov property of the behavioral states, and (iii) corrects for non-Markovity by running a deterministic factored Finite State Model, which can be learned. We illustrate the properties of the architecture in the near-deterministic Ms. Pac-Man game. We analyze the architecture from the point of view of evolutionary, individual, and social learning.
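
To make the idea of "testing the Markov property of the behavioral states" concrete, here is a minimal Python sketch. It is not the test used in the paper, whose details are not given in the abstract. It is a simple illustrative check that assumes a near-deterministic world and a recorded sequence of discrete states. The function names (`successor_counts`, `determinism`, `markov_gap`) are hypothetical. The check compares how predictable the next state is when conditioning on one past state versus two. A clearly positive gap suggests that the current state alone does not carry all the relevant information.

```python
from collections import Counter, defaultdict


def successor_counts(states, order, start=2):
    """Count next-state occurrences conditioned on the last `order` states.

    `start` fixes the first predicted index so that different orders
    are compared on exactly the same transitions (requires start >= order).
    """
    counts = defaultdict(Counter)
    for i in range(start, len(states)):
        context = tuple(states[i - order:i])
        counts[context][states[i]] += 1
    return counts


def determinism(counts):
    """Fraction of transitions that go to the most frequent successor
    of their context (1.0 means fully deterministic given the context)."""
    total = sum(sum(c.values()) for c in counts.values())
    top = sum(max(c.values()) for c in counts.values())
    return top / total if total else 1.0


def markov_gap(states):
    """Gain in predictability from conditioning on two past states
    instead of one. Near zero is consistent with a first-order Markov
    description; a large value hints at hidden (aliased) state."""
    first = determinism(successor_counts(states, order=1))
    second = determinism(successor_counts(states, order=2))
    return second - first


# Illustrative data (made up): "B" is an aliased observation that is
# followed by "C" or "A" depending on what preceded it.
aliased = list("ABCB" * 5)
markovian = list("ABC" * 5)

print("gap (aliased):  ", markov_gap(aliased))
print("gap (markovian):", markov_gap(markovian))
```

In the aliased sequence, knowing only the current state leaves the successor of "B" ambiguous, while knowing the previous state as well resolves it, so the gap is positive. In the second sequence the current state already determines the next one and the gap is zero. A real agent would need enough data per context and a proper statistical test. This sketch only illustrates the kind of question being asked.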
