Answer Set Programming for Non-Stationary Markov Decision Processes

05/03/2017
by   Leonardo A. Ferreira, et al.
0

Non-stationary domains, where unforeseen changes happen, present a challenge for agents to find an optimal policy for a sequential decision making problem. This work investigates a solution to this problem that combines Markov Decision Processes (MDP) and Reinforcement Learning (RL) with Answer Set Programming (ASP) in a method we call ASP(RL). In this method, Answer Set Programming is used to find the possible trajectories of an MDP, from where Reinforcement Learning is applied to learn the optimal policy of the problem. Results show that ASP(RL) is capable of efficiently finding the optimal solution of an MDP representing non-stationary domains.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/05/2017

A method for the online construction of the set of states of a Markov Decision Process using Answer Set Programming

Non-stationary domains, that change in unpredicted ways, are a challenge...
research
02/16/2019

Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles

Spatial puzzles composed of rigid objects, flexible strings and holes of...
research
06/09/2023

Robust Reinforcement Learning via Adversarial Kernel Approximation

Robust Markov Decision Processes (RMDPs) provide a framework for sequent...
research
09/26/2022

Delayed Geometric Discounts: An Alternative Criterion for Reinforcement Learning

The endeavor of artificial intelligence (AI) is to design autonomous age...
research
12/21/2019

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

Markov Decision Process (MDP) problems can be solved using Dynamic Progr...
research
04/03/2023

Action Pick-up in Dynamic Action Space Reinforcement Learning

Most reinforcement learning algorithms are based on a key assumption tha...
research
08/18/2023

A Robust Policy Bootstrapping Algorithm for Multi-objective Reinforcement Learning in Non-stationary Environments

Multi-objective Markov decision processes are a special kind of multi-ob...

Please sign up or login with your details

Forgot password? Click here to reset