
Cross-Entropic Learning of a Machine for the Decision in a Partially Observable Universe

by Frederic Dambreville, et al.

Revision of the paper previously entitled "Learning a Machine for the Decision in a Partially Observable Markov Universe". In this paper, we are interested in optimal decisions in a partially observable universe. Our approach is to directly approximate an optimal strategic tree that depends on the observations. This approximation is made by means of a parameterized probability law. A particular family of hidden Markov models (HMMs), with input and output, is considered as a model of the policy. A method for optimizing the parameters of these HMMs is proposed and applied. This optimization is based on the cross-entropy principle for rare-event simulation developed by Rubinstein.
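To illustrate the kind of optimization loop the abstract refers to, here is a minimal sketch of a generic cross-entropy method on a toy problem. It is not the authors' algorithm: the HMM policy family is replaced by independent Bernoulli parameters over a bit vector, and the function name `cross_entropy_optimize`, the `matches` score, the `target` vector, and all default settings (sample count, elite fraction, smoothing) are assumptions made only for this example. The loop samples candidates from the parameterized law, keeps the best-scoring fraction, and moves the parameters toward the empirical frequencies of that elite set.

```python
import random


def cross_entropy_optimize(score, n_bits, n_samples=200, elite_frac=0.1,
                           n_iters=30, smoothing=0.7, seed=0):
    """Toy cross-entropy method over independent Bernoulli parameters.

    `score` maps a list of 0/1 values to a number to be maximized.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial sampling law: every bit is a fair coin
    n_elite = max(1, int(n_samples * elite_frac))

    for _ in range(n_iters):
        # 1. Draw candidates from the current parameterized law.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(n_samples)]
        # 2. Keep the best-scoring candidates (the "elite" set).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # 3. Move each parameter toward the elite frequency of that bit.
        #    For Bernoulli laws this frequency is the cross-entropy
        #    minimizing update; smoothing damps the step.
        for i in range(n_bits):
            freq = sum(s[i] for s in elite) / n_elite
            probs[i] = smoothing * freq + (1 - smoothing) * probs[i]
    return probs


# Hypothetical objective: count the bits that agree with a hidden target.
target = [1, 0, 1, 1, 0, 0, 1, 0]


def matches(candidate):
    return sum(int(c == t) for c, t in zip(candidate, target))


probs = cross_entropy_optimize(matches, n_bits=len(target))
print([round(p, 2) for p in probs])
```

In the setting described in the abstract, the sampled object would presumably be a policy drawn from the HMM family and the score an evaluation of that policy; the Bernoulli vector here only stands in for that parameterized law.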




Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process

The objective is to study an on-line Hidden Markov model (HMM) estimatio...

Reinforcement Learning of POMDPs using Spectral Methods

We propose a new reinforcement learning algorithm for partially observab...

The Partially Observable Hidden Markov Model and its Application to Keystroke Dynamics

The partially observable hidden Markov model is an extension of the hidd...

Learning classifier systems with memory condition to solve non-Markov problems

In the family of Learning Classifier Systems, the classifier system XCS ...

Hidden Markov Models and their Application for Predicting Failure Events

We show how Markov mixed membership models (MMMM) can be used to predict...