Forward and Backward Bellman equations improve the efficiency of EM algorithm for DEC-POMDP

03/19/2021
by   Takehiro Tottori, et al.

A decentralized partially observable Markov decision process (DEC-POMDP) models sequential decision-making problems for a team of agents. Because planning in a DEC-POMDP can be interpreted as maximum likelihood estimation for a latent variable model, a DEC-POMDP can be solved by the EM algorithm. However, EM for DEC-POMDP requires running the forward–backward algorithm up to the infinite horizon, which impairs computational efficiency. In this paper, we propose the Bellman EM algorithm (BEM) and the modified Bellman EM algorithm (MBEM), which introduce the forward and backward Bellman equations into EM. BEM can be more efficient than EM because it solves the forward and backward Bellman equations instead of running the forward–backward algorithm up to the infinite horizon. However, BEM is not always more efficient than EM on large problems because it computes a matrix inverse. MBEM circumvents this shortcoming by solving the forward and backward Bellman equations without the matrix inverse. Our numerical experiments demonstrate that MBEM converges faster than EM.
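The trade-off described in the abstract can be illustrated on a much simpler object than a DEC-POMDP. The sketch below is not the paper's algorithm: it assumes a plain discounted Markov chain with a hypothetical transition matrix `P`, reward vector `r`, and discount `gamma`, and computes the same quantity v = Σ_t γ^t P^t r in three ways: by a term-by-term sum up to a truncation horizon (analogous to running a forward–backward recursion over many time steps), by directly solving the Bellman equation v = r + γPv as a linear system (analogous to using a matrix inverse), and by fixed-point iteration on the Bellman equation (one generic way to avoid the inverse). The function names are illustrative, and the abstract does not say whether MBEM uses exactly this iteration.

```python
def mat_vec(P, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def truncated_sum(P, r, gamma, horizon):
    """Accumulate sum_{t=0}^{horizon} gamma^t P^t r term by term."""
    total = [0.0] * len(r)
    term = list(r)  # the t = 0 term
    for _ in range(horizon + 1):
        total = [a + b for a, b in zip(total, term)]
        term = [gamma * x for x in mat_vec(P, term)]
    return total


def solve_direct(P, r, gamma):
    """Solve (I - gamma P) v = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    # Augmented matrix [I - gamma P | r].
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][j] for j in range(n)] + [r[i]]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for i in range(col + 1, n):
            factor = A[i][col] / A[col][col]
            for j in range(col, n + 1):
                A[i][j] -= factor * A[col][j]
    # Back substitution.
    v = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (A[i][n] - s) / A[i][i]
    return v


def fixed_point(P, r, gamma, v0, tol=1e-10, max_iter=10000):
    """Iterate v <- r + gamma P v from v0; return (v, iterations used)."""
    v = list(v0)
    for it in range(1, max_iter + 1):
        new_v = [ri + gamma * x for ri, x in zip(r, mat_vec(P, v))]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v, it
        v = new_v
    return v, max_iter


# Hypothetical two-state chain, used only for illustration.
P = [[0.9, 0.1], [0.2, 0.8]]
r = [1.0, 0.0]
gamma = 0.95

v_sum = truncated_sum(P, r, gamma, horizon=1000)
v_direct = solve_direct(P, r, gamma)
v_iter, n_cold = fixed_point(P, r, gamma, v0=[0.0, 0.0])
print(v_sum, v_direct, v_iter, n_cold)
```

In this toy setting the direct solve costs roughly cubic time in the number of states, which is the kind of cost that can dominate on large problems, while the fixed-point iteration needs only matrix-vector products and can be started from a previous solution to reduce the number of sweeps.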


