Existence and Finiteness Conditions for Risk-Sensitive Planning: Results and Conjectures
Decision-theoretic planning with risk-sensitive planning objectives is important for building autonomous agents or decision-support systems for real-world applications. However, this line of research has been largely ignored in the artificial intelligence and operations research communities since planning with risk-sensitive planning objectives is more complicated than planning with risk-neutral planning objectives. To remedy this situation, we derive conditions that guarantee that the optimal expected utilities of the total plan-execution reward exist and are finite for fully observable Markov decision process models with non-linear utility functions. In case of Markov decision process models with both positive and negative rewards, most of our results hold for stationary policies only, but we conjecture that they can be generalized to non stationary policies.
READ FULL TEXT