Comparing discounted and average-cost Markov Decision Processes: a statistical significance perspective

12/01/2021

by Dylan Solms, et al.

Optimal Markov Decision Process policies for problems with finite state and action spaces are identified through a partial ordering obtained by comparing the value function across states. This is referred to as state-based optimality. This paper identifies when such optimality guarantees some form of system-based optimality, as measured by a scalar. Four such system-based metrics are introduced. Univariate empirical distributions of these metrics are obtained through simulation so as to assess whether theoretically optimal policies provide a statistically significant advantage. This assessment is conducted using a Student's t-test, Welch's t-test and a Mann-Whitney U-test. The proposed method is applied to a common problem in queuing theory: admission control.
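The statistical comparison described above can be sketched in Python with `scipy.stats`. The sample data below is hypothetical (two simulated distributions of a scalar system-based metric, e.g. per-episode cost, under two policies); the paper's actual metrics and simulation setup are not reproduced here. The three tests differ in their assumptions: Student's t-test assumes equal variances, Welch's does not, and Mann-Whitney is non-parametric.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical data: empirical distributions of a scalar system-based
# metric (e.g. average cost per episode) under two candidate policies.
metric_policy_a = rng.normal(loc=1.0, scale=0.5, size=200)
metric_policy_b = rng.normal(loc=1.5, scale=0.7, size=200)

# Student's t-test: assumes both samples share a common variance.
t_student, p_student = stats.ttest_ind(metric_policy_a, metric_policy_b,
                                       equal_var=True)
# Welch's t-test: drops the equal-variance assumption.
t_welch, p_welch = stats.ttest_ind(metric_policy_a, metric_policy_b,
                                   equal_var=False)
# Mann-Whitney U-test: non-parametric, no normality assumption.
u_stat, p_mw = stats.mannwhitneyu(metric_policy_a, metric_policy_b)

for name, p in [("Student's t", p_student),
                ("Welch's t", p_welch),
                ("Mann-Whitney U", p_mw)]:
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: p = {p:.3g} ({verdict})")
```

Running all three tests side by side guards against drawing a conclusion that only holds under one test's distributional assumptions.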
