Kernel Taylor-Based Value Function Approximation for Continuous-State Markov Decision Processes

06/03/2020
by Junhong Xu, et al.

We propose a principled kernel-based policy iteration algorithm for solving continuous-state Markov Decision Processes (MDPs). In contrast to most decision-theoretic planning frameworks, which assume a fully known state transition model, our method eliminates this strong assumption, which is often extremely difficult to engineer in reality. To achieve this, we first apply a second-order Taylor expansion to the value function. The Bellman optimality equation is then approximated by a partial differential equation that relies only on the first and second moments of the transition model. Combining this approximation with a kernel representation of the value function, we design an efficient policy iteration algorithm whose policy evaluation step can be expressed as a linear system of equations characterized by a finite set of supporting states. We have validated the proposed method through extensive simulations in both simplified and realistic planning scenarios, and the experiments show that our approach yields much better performance than several baseline methods.
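The following is a rough sketch of the two ideas summarized above; the notation (drift mu(s,a) = E[s' - s], second moment M(s,a) = E[(s' - s)(s' - s)^T], discount gamma) and the specific kernel choice are illustrative assumptions of ours, not the paper's exact formulation. Expanding the value function to second order around the current state and taking the expectation over the next state gives

\[
\mathbb{E}\big[V(s')\big] \approx V(s) + \nabla V(s)^\top \mu(s,a) + \tfrac{1}{2}\operatorname{tr}\!\big(\nabla^2 V(s)\, M(s,a)\big),
\]

so the Bellman optimality equation is approximated by a fixed-point condition that depends on the transition model only through its first two moments:

\[
V(s) \approx \max_{a}\Big\{ R(s,a) + \gamma\Big[ V(s) + \nabla V(s)^\top \mu(s,a) + \tfrac{1}{2}\operatorname{tr}\!\big(\nabla^2 V(s)\, M(s,a)\big) \Big]\Big\}.
\]

With a kernel representation V(s) = sum_i alpha_i k(s, s_i) over a finite set of supporting states, the gradient and Hessian of V are linear in the weights alpha, so policy evaluation under a fixed policy reduces to a linear solve. A minimal Python sketch, assuming an RBF kernel and precomputed per-state moments (both our assumptions):

```python
import numpy as np

def rbf(s, si, ell=1.0):
    """RBF kernel value, gradient, and Hessian w.r.t. the first argument s."""
    d = s - si
    k = np.exp(-(d @ d) / (2.0 * ell**2))
    grad = -(d / ell**2) * k
    hess = (np.outer(d, d) / ell**4 - np.eye(len(s)) / ell**2) * k
    return k, grad, hess

def policy_evaluation(S, mu, M2, r, gamma=0.95, ell=1.0, reg=1e-6):
    """Solve for weights alpha with V(s) = sum_i alpha_i * k(s, S[i]).

    S  : (n, d) supporting states
    mu : (n, d) first moments E[s' - s] under the current policy
    M2 : (n, d, d) second moments E[(s' - s)(s' - s)^T]
    r  : (n,) immediate rewards at the supporting states
    """
    n = len(S)
    K = np.zeros((n, n))   # kernel matrix: V(S[j]) = K[j] @ alpha
    B = np.zeros((n, n))   # Taylor-expanded expected next value, linear in alpha
    for j in range(n):
        for i in range(n):
            k, grad, hess = rbf(S[j], S[i], ell)
            K[j, i] = k
            B[j, i] = k + grad @ mu[j] + 0.5 * np.trace(hess @ M2[j])
    # Fixed point K @ alpha = r + gamma * B @ alpha  =>  (K - gamma * B) alpha = r
    A = K - gamma * B + reg * np.eye(n)
    return np.linalg.solve(A, r)
```

In this sketch, policy improvement would pick, at each supporting state, the action whose moments maximize the bracketed backup; the ridge term reg is only there to keep the illustrative solve well conditioned.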

Related research

03/03/2019 · State-Continuity Approximation of Markov Decision Processes via Finite Element Analysis for Autonomous System Planning
Motion planning under uncertainty for an autonomous system can be formul...

06/12/2022 · Geometric Policy Iteration for Markov Decision Processes
Recently discovered polyhedral structures of the value function for fini...

09/17/2013 · Models and algorithms for skip-free Markov decision processes on trees
We introduce a class of models for multidimensional control problems whi...

12/07/2022 · Tight Performance Guarantees of Imitator Policies with Continuous Actions
Behavioral Cloning (BC) aims at learning a policy that mimics the behavi...

09/18/2020 · Low-rank MDP Approximation via Moment Coupling
We propose a novel method—based on local moment matching—to approximate ...

12/01/2021 · Comparing discounted and average-cost Markov Decision Processes: a statistical significance perspective
Optimal Markov Decision Process policies for problems with finite state ...

10/21/2022 · Online and Lightweight Kernel-based Approximated Policy Iteration for Dynamic p-norm Linear Adaptive Filtering
This paper introduces a solution to the problem of selecting dynamically...
