Zero-Sum Semi-Markov Games with State-Action-Dependent Discount Factors
The semi-Markov model is one of the most general models for stochastic dynamic systems. This paper studies a two-person zero-sum game for semi-Markov processes under the expected discounted payoff criterion with state-action-dependent discount factors. The state and action spaces are both Polish spaces, and the payoff function is ω-bounded. We first construct a fairly general model of semi-Markov games under a given semi-Markov kernel and a pair of strategies. Next, in addition to the standard regularity condition and the continuity-compactness condition for semi-Markov games, we impose a "drift condition" on the semi-Markov kernel and assume that the discount factors have a positive lower bound. Under these conditions, we use the Shapley equation to prove the existence of the value function and of a pair of optimal stationary strategies for the game. Moreover, when the state and action spaces are both finite, we develop a value iteration-type algorithm for computing the value function and an ε-Nash equilibrium of the game, and we prove that the algorithm converges. Finally, we present numerical examples that illustrate our main results.
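To make the finite case concrete, the following is a minimal sketch of a value iteration of Shapley type, not the paper's algorithm. It rests on several assumptions that do not come from the abstract: each player has exactly two actions (so each stage game has a closed-form value), sojourn times are exponential with a hypothetical rate `lam`, and the state-action-dependent discount rate `alpha[x][a][b]` enters through the effective one-step factor `lam / (lam + alpha[x][a][b])`. The names `value_2x2`, `shapley_operator`, and `shapley_iteration`, and all numbers, are illustrative.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    lower = max(min(a, b), min(c, d))  # best guaranteed payoff with pure rows
    upper = min(max(a, c), max(b, d))  # best guaranteed cap with pure columns
    if lower == upper:                 # a pure saddle point exists
        return lower
    # No saddle point: both players mix, and the denominator is nonzero.
    return (a * d - b * c) / (a + d - b - c)


def shapley_operator(v, reward, trans, beta):
    """Apply one step of the (assumed) Shapley operator to the vector v."""
    new_v = []
    for x in range(len(v)):
        game = [
            [
                reward[x][a][b]
                + beta[x][a][b] * sum(p * v[y] for y, p in enumerate(trans[x][a][b]))
                for b in range(2)
            ]
            for a in range(2)
        ]
        new_v.append(value_2x2(game))
    return new_v


def shapley_iteration(reward, trans, beta, tol=1e-10, max_iter=10_000):
    """Iterate the operator from zero until successive iterates differ by < tol."""
    v = [0.0] * len(reward)
    for _ in range(max_iter):
        new_v = shapley_operator(v, reward, trans, beta)
        gap = max(abs(s - t) for s, t in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    return v


# Hypothetical two-state example. alpha[x][a][b] > 0 is the discount rate,
# so every effective factor beta[x][a][b] is strictly below 1.
lam = 1.0  # assumed exponential sojourn rate, identical in all states
alpha = [[[0.1, 0.2], [0.3, 0.1]], [[0.2, 0.2], [0.1, 0.4]]]
beta = [
    [[lam / (lam + alpha[x][a][b]) for b in range(2)] for a in range(2)]
    for x in range(2)
]
# reward[x][a][b]: assumed expected discounted payoff earned during one sojourn.
reward = [[[1.0, -1.0], [-1.0, 1.0]], [[2.0, 0.0], [0.0, 1.0]]]
# trans[x][a][b][y]: probability that the next state is y.
trans = [
    [[[0.5, 0.5], [0.9, 0.1]], [[0.2, 0.8], [0.5, 0.5]]],
    [[[1.0, 0.0], [0.3, 0.7]], [[0.6, 0.4], [0.0, 1.0]]],
]
v = shapley_iteration(reward, trans, beta)
print(v)
```

Because every effective factor here is below 1, the operator is a contraction in the sup norm and the iterates converge geometrically. With more than two actions per player, `value_2x2` would have to be replaced by a general matrix-game solver, for example a linear program.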