Faster Approximate Dynamic Programming by Freezing Slow States

01/03/2023
by   Yijia Wang, et al.
0

We consider infinite horizon Markov decision processes (MDPs) with fast-slow structure, meaning that certain parts of the state space move "fast" (and in a sense, are more influential) while other parts transition more "slowly." Such structure is common in real-world problems where sequential decisions need to be made at high frequencies, yet information that varies at a slower timescale also influences the optimal policy. Examples include: (1) service allocation for a multi-class queue with (slowly varying) stochastic costs, (2) a restless multi-armed bandit with an environmental state, and (3) energy demand response, where both day-ahead and real-time prices play a role in the firm's revenue. Models that fully capture these problems often result in MDPs with large state spaces and large effective time horizons (due to frequent decisions), rendering them computationally intractable. We propose an approximate dynamic programming algorithmic framework based on the idea of "freezing" the slow states, solving a set of simpler finite-horizon MDPs (the lower-level MDPs), and applying value iteration (VI) to an auxiliary MDP that transitions on a slower timescale (the upper-level MDP). We also extend the technique to a function approximation setting, where a feature-based linear architecture is used. On the theoretical side, we analyze the regret incurred by each variant of our frozen-state approach. Finally, we give empirical evidence that the frozen-state approach generates effective policies using just a fraction of the computational cost, while illustrating that simply omitting slow states from the decision modeling is often not a viable heuristic.

READ FULL TEXT

page 35

page 38

research
01/23/2013

SPUDD: Stochastic Planning using Decision Diagrams

Markov decisions processes (MDPs) are becoming increasing popular as mod...
research
02/27/2023

Optimistic Planning by Regularized Dynamic Programming

We propose a new method for optimistic planning in infinite-horizon disc...
research
07/23/2021

An Adaptive State Aggregation Algorithm for Markov Decision Processes

Value iteration is a well-known method of solving Markov Decision Proces...
research
09/26/2019

Action Selection for MDPs: Anytime AO* vs. UCT

In the presence of non-admissible heuristics, A* and other best-first al...
research
06/09/2011

Efficient Solution Algorithms for Factored MDPs

This paper addresses the problem of planning under uncertainty in large ...
research
02/06/2013

Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes

We present a method for solving implicit (factored) Markov decision proc...
research
09/02/2021

Optimal Path Planning of Autonomous Marine Vehicles in Stochastic Dynamic Ocean Flows using a GPU-Accelerated Algorithm

Autonomous marine vehicles play an essential role in many ocean science ...

Please sign up or login with your details

Forgot password? Click here to reset