Transition Tensor Markov Decision Processes: Analyzing Shot Policies in Professional Basketball
In this paper we model basketball plays as episodes from team-specific non-stationary Markov decision processes (MDPs) with shot clock dependent transition probability tensors. Bayesian hierarchical models are employed in the modeling and parametrization of these tensors to borrow strength across players and through time. To enable computational feasibility, we combine lineup-specific MDPs into team-average MDPs using a novel transition weighting scheme. Specifically, we derive the dynamics of the team-average process such that the expected transition count for an arbitrary state-pair is equal to the weighted sum of the expected counts of the separate lineup-specific MDPs. We then utilize these non-stationary MDPs in the creation of a basketball play simulator with uncertainty propagated via posterior samples of the model components. After calibration, we simulate seasons both on-policy and under altered policies and explore the net changes in efficiency and production under the alternate policies. Additionally, we discuss the game-theoretic ramifications of testing alternative decision policies.
READ FULL TEXT