1 Introduction
The BradleyTerry model [3, 18]
has long been used for evaluating paired comparisons, such as games between pairs of teams in which one team or the other wins each game. The model assigns a strength parameter to each team, and the odds ratio associated with the probability of a team winning a game is equal to the ratio of the strengths. These strength parameters can be estimated based on the full set of game results and used to rank teams or make future predictions. The model has been extended by Davidson
[7] to contests in which a tie or drawn contest is a possible outcome. For many years, such ties were a common occurrence in the sport of ice hockey, but recently tiebreaking methods such as an overtime period played under different rules and/or a shootout in which the teams alternate penalty shots are used to determine a winner. Results in overtime or shootouts can be evaluated differently from wins in regulation play. For instance, since 2006[8] competitions organized by International Ice Hockey Federation (IIHF) have awarded three points in the standings to a team winning in regulation, two points for a win in overtime or a shootout, one point for a loss in overtime or a shootout, and no points for a loss in regulation, and many leagues have followed suit. Compared to the prior system which awarded two points for a win, one for a tie, and none for a loss, which effectively treated a tie as half a win and half a loss, the fouroutcome system treats an overtime/shootout win as of a win and of a loss.^{1}^{1}1We do not consider nonzerosum point systems such as that used in association football (soccer) which awards three points for a win and one for a draw, so that drawn matches are only worth two points total rather than three. Likewise, the National Hockey League awards all wins two points and overtime/shootout losses one point; this 2210 system awards three total points for games which go into overtime, but only two for games decided in regulation. One possible approach to either of these situations (games with three or four outcomes) is to use standard BradleyTerry and assign fractional wins as appropriate to the point system (see, e.g., [17]). However, this is unsatisfying, as it provides no way to assign a probability for a future game to end in a tie or overtime or shootout result. In this paper, we instead consider a generalization of the tie model of [7] which associates one strength parameter to each team, along with a single parameter describing the tendency for games to go into overtime.The rest of this paper is organized as follows: In Section 2, we describe the three models (standard BradleyTerry, BradleyTerryDavidson including ties, and a new model with four possible game outcomes including overtime/shootout wins and losses), and exhibit a generalization of the relevant formulas which describes all three cases. In Section 3
we describe methods for inferring the relevant parameters of these models given a set of game results: maximum likelihood estimation, and Bayesian inference using either a Gaussian approximation or Hamiltonian Monte Carlo. In Section
4 we demonstrate these methods using a recent set of game results: the 20202021 Eastern College Athletic Conference (ECAC) season. This season used the standard IIHF system with 3210 points assigned for regulation wins, overtime/shootout wins, overtime/shootout losses, and regulation losses, respectively. For the purposes of illustration, we evaluate the ECAC results with the fouroutcome model as well as with the other two models, treating in one case all wins the same, and in the other all overtime/shootout results as ties.2 Models
In the standard BradleyTerry model [3, 18] each team has a strength , and the modelled probability that team will win a game with team is
(2.1) 
so that the probability of a set of game outcomes in which team plays team times and wins of those games is^{2}^{2}2The first form explicitly includes each pair of teams only once, while the second corrects for the doublecounting, taking advantage of the fact that . If the order of the games between pairs of teams is ignored, the sampling distribution for the is instead .
(2.2) 
where is the number of teams, and .
Davidson [7] proposed an extension for competitions which include the probabilities of ties, in which the probabilities of the three possible outcomes of a game are
(2.3a)  
(2.3b)  
(2.3c) 
where is an additional parameter which describes how likely ties are to occur. (The probability of a tie in a game between evenly matched teams is .) Evidently, , and . The probability of a given set of game outcomes in which the games between teams and result in wins, ties and losses for team (where and ) is
(2.4) 
We propose an extension appropriate for a system in which a win in overtime or a shootout is treated as of a win and of a loss, Writing the four possible game outcomes as RW for regulation win, OW for overtime/shootout win, OL for overtime/shootout loss, and RL for regulation loss, the modelled probability of each outcome would be^{3}^{3}3The exponents are chosen to correspond to the share of the points ( and , respectively) awarded for an overtime/shootout win or loss. This has the desirable feature that the maximum likelihood equation (3.7) becomes (after multiplying by )
(2.5a)  
(2.5b)  
(2.5c)  
(2.5d) 
The probability for a set of game outcomes will then be
(2.6) 
If we write and , we can describe all three models as special cases of a general model in which the probability of a game between teams and ending in outcome is
(2.7) 
where
(2.8) 
is a vector equivalent of the logistic function known as the softmax function.
[4] The probability for a set of game outcomes is(2.9) 
Specifically,

For the standard BradleyTerry model, , , and .

For the BradleyTerryDavidson model with ties, , , , , and .

For the model introduced in this paper, , , , , , and .
All of these models satisfy , and have “opposite” outcomes and such that , , and . They also satisfy and . We confine ourselves below to cases where these properties hold.
3 Inference of Parameters
3.1 Maximum Likelihood
Maximum likelihood estimates (MLEs) of BradleyTerry strength parameters [18, 9, 7] provide a straightforward way of associating a “rating” to each team based on their game results, and have been proposed as a replacement for less reliable ways of evaluating a team’s game results in light of the difficulty of their schedule.[5]
We can consider the probability as a likelihood function of the parameters and , with loglikelihood
(3.1) 
We can use the identity
(3.2) 
to show that
(3.3a)  
and  
(3.3b) 
which means that
(3.4) 
and
(3.5) 
Using these, we can write the maximum likelihood equations as
(3.6) 
and
(3.7) 
where
(3.8) 
can be interpreted in the models considered as the number of games which are tied or go to overtime, respectively, and
(3.9) 
can be seen as the total number of “points” for team . The maximum likelihood equation set each of these quantities equal to their expectation values.
We can solve the maximum likelihood equations by a generalization of the iterative method in [9]. writing them
(3.10) 
(where we have used the fact that the only nonzero term in the numerator has ) and
(3.11) 
As in the standard BradleyTerry model, the overall multiplicative scale of is undefined (because can be written so that the team strengths appear only in the combination ), so it is necessary to rescale the team strengths at each iteration to preserve a property such as . Beyond that, there are conditions for the maximum likelihood estimates to be finite and welldefined, which are explored in e.g., [2, 14, 6].
3.2 Bayesian Approach
It is useful to move beyond maximum likelihood estimates, both to quantify uncertainty in the model parameters, and to make predictions about the outcome of future games. (For instance, [17] proposed simulating future games with probabilites drawn from a posterior distribution capturing the uncertainty in the strength parameters, rather than fixed probabilties generated from the MLEs of those parameters.)
A convenient framework for parameter estimates including uncertainties is Bayesian inference, which defines the posterior probability density for the parameters
and , or equivalently and , given a set of game results and prior assumptions , as(3.12) 
A variety of choices can be made for the multivariate prior distribution on [16] in the BradleyTerry model, and likewise for the tie/overtime parameter . For simplicity, we work in this paper with the improper Haldane prior^{4}^{4}4So named because the marginal prior distribution for probabilities such as will follow the Haldane prior [10, 11], which is the limit of a distribution as .
(3.13) 
which means that the posterior distribution is proportional to the likelihood:
(3.14) 
With this choice of prior, the posterior probability density will be independent of the combination , but otherwise will be normalizable under the same circumstances that lead to welldefined maximum likelihood estimates for the parameters.
3.2.1 Gaussian Approximation
One convenient approach is to Taylor expand the logposterior about the maximum a posteriori solution (which in this case is the maximum likelihood solution ).^{5}^{5}5
Note that this method does not assign special significance to the MAP estimates, but uses them as the starting point for a convenient approximation to the posterior probability distribution.
Truncating the expansion at second order gives a Gaussian approximation(3.15) 
where is the Hessian matrix
(3.16) 
with elements^{6}^{6}6Note the similarity to the Fisher information matrix , which differs from the Hessian in that that depends on the observed data, while is a function defined on parameter space.
(3.17a)  
(3.17b)  
(3.17c) 
To compute the elements of the Hessian matrix, we return to the first derivative (3.4) and differentiate them to get
(3.18) 
where
(3.19) 
is the probability of a tie or overtime game, depending on the model, and we have used the fact that since . Similarly, using the properties , , , and , we find
(3.20) 
and, finally, differentiating (3.5) gives us
(3.21) 
so that the Hessian matrix has components
(3.22a)  
(3.22b)  
(3.22c) 
Note that in the case of the BradleyTerry model, where the only outcomes are win and loss, the condition simplifies the Hessian to (since the parameter is not actually part of the likelihood), and
(3.23) 
which is the form seen in, e.g., [17].
The Hessian matrix in (3.22) is singular, since and , which ultimately arise from the fact that the probabilities , and thus the likelihood, are unchanged by adding the same constant to all the
. This can be handled computationally by constructing a variancecovariance matrix
which is the MoorePenrose pseudoinverse[12]^{7}^{7}7For a real symmetric matrix with a complete eigenvalue decomposition, this operation replaces each nonzero eigenvalue with its reciprocal while leaving zero eigenvalues unchanged.
of the Hessian matrix, and approximating the posterior as a multivariate Gaussian with a mean of and a variancecovariance matrix . This has the effect of enforcing the constraint(3.24) 
on the combination of the parameters which has no influence on the model.
This Gaussian approximation can be used to produce analytic estimates of quantities of interest, or used for Monte Carlo sampling, as illustrated in Section 4. It can also be used as a starting point for importance sampling of the sort discussed in [17]. For the present work, we consider a different Monte Carlo method for sampling from the exact posterior.
3.2.2 Hamiltonian Monte Carlo
Markovchain Monte Carlo methods provide a convenient way to draw samples from a posterior distribution. We demonstrate in this paper how to draw posterior samples for the BradleyTerry extensions considered, using Hamiltonian Monte Carlo as implemented in the Stan library.[15] There are a few technical considerations. Because the posterior on and
is improper, trying to draw from it directly will lead to chains which never converge. Any probabilities constructed from the samples will be wellbehaved, since only the meaningless degree of freedom
is unconstrained, but these apparent errors make it more difficult to detect other potential problems. It is thus useful instead to consider only variables (and ) which contribute to the probability model via (see (2.7))(3.25) 
Of course, the full set of values are not independent. Instead, they are determined by the parameters for . Given the we can construct .
In Appendix A we show the code of the Stan model used to perform Hamiltonian Monte Carlo simulations of all three models.
()  

Team  Cg  Ck  Qn  SL  RW  OW  OL  RL 
Colgate (Cg)  —  1(1)  1(0)  2(1)  4  2  3  9 
Clarkson (Ck)  3(1)  —  1(2)  1(0)  5  3  4  2 
Quinnipiac (Qn)  4(1)  1(2)  —  4(1)  9  4  2  3 
St. Lawrence (SL)  2(1)  0(1)  1(0)  —  3  2  2  7 
4 Demonstration Using Game Results
We now illustrate the application of the models described in this paper using game results from a competition which used the 3210 point system: the 20202021 Eastern College Athletic Conference (ECAC) season. While the league ordinarily plays a balanced roundrobin schedule in which each team plays each other team the same number of times, the season in question ended up being unbalanced due to cancellations of games arising from the COVID19 pandemic. In Table 1 we show the results for the ECAC season, in the form of and for each team against each opponent, along with the total number of results of each type for each team, .
4.1 ECAC: Standard BradleyTerry Model
As a first demonstration, we consider the standard BradleyTerry model applied to the ECAC results with regulation and overtime/shootout wins being counted as simply “wins” and regulation and overtime/shootout losses being counted as “losses”. I.e., we define and . The resulting maximumlikelihood solutions and associated probabilities are shown in Table 2, along with the uncertainties and correlations encoded in the variancecovariance matrix of the Gaussian approximation to the posterior distribution.
Team  Cg  Ck  Qn  SL  Cg  Ck  Qn  SL  

Cg  —  
Ck  —  
Qn  —  
SL  — 
Since the logstrengths have an arbitrary additive scale, a more meaningful understanding of the posterior distributions is obtained by considering the marginal distribution of the difference of a pair of team strengths . In Figure 1, we illustrate the maximum likelihood estimate and posterior distribution of this quantity for two of the six pairs of teams: QuinnipiacColgate and QuinnipiacClarkson. We show the posterior in Gaussian approximation (for which the marginal posterior on
is also a Gaussian), in a Monte Carlo drawn from the approximate multivariate Gaussian distribution, and in posterior samples from the exact posterior generated using Hamiltonian Monte Carlo with the Stan library.
[15] We can transform the posterior on a difference in logstrength into a posterior on the corresponding probability ; this is shown in Figure 2 for the two sets of posterior samples. In all cases, the exact marginal posterior, as estimated by the Hamiltonian Monte Carlo is only slightly different from the Gaussian approximation. This is similar to results found using importance sampling in [17].()  
Team  Cg  Ck  Qn  SL  
Cg  —  ()  ()  ()  
Ck  ()  —  ()  ()  
Qn  ()  ()  —  ()  
SL  ()  ()  ()  —  
Team  Cg  Ck  Qn  SL  

Cg  
Ck  
Qn  
SL  
4.2 ECAC: BradleyTerryDavidson Model with Ties
Moving on to the BradleyTerryDavidson model with ties, we now consider inference of the logstrength parameters along with the logtie parameter . We illustrate the methods by reanalyzing the 20202021 ECAC results, with all overtime games treated as ties, so that now , , and . The maximum likelihood solutions and are shown in Table 3, along with the associated probabilities for a win and for a tie in contests between pairs of teams. In Table 4, we show the maxumumlikelihood estimates along with the uncertainties in and correlations among the logstrengths and the logtie parameter , which are encoded in the variancecovariance matrix of the Gaussian approximation to the posterior distribution.
As with the standard BradleyTerry model, we can show the marginal posterior distributions on the differences between pairs of logstrength parameters, and we do this in Figure 3 for the same pairs of teams as before. Once again, samples drawn from the multivariate Gaussian approximation capture the shape of that distribution well, and samples drawn from the exact posterior using Hamiltonian Monte Carlo are slightly different but similar.
We cannot convert directly into a probability, however, since probabilities depend on the logtie parameter as well. In Figure 4 we plot the marginal posterior on . The parameter can be transformed into a probability where () for a game between evenlymatched teams to be tied, and we plot the posterior for this as well. Finally, in Figure 5 we illustrate the joint marginal posterior in and for our selected pairs of teams.
To illustrate the posterior on the probabilities for a pair of teams, we note that the constraint means that the space is actually two dimensional. The natural visualization for the behavior of three quantities which sum to one is a ternary plot, and we contour plot density estimates of the posterior and its Gaussian approximation in Figure 6, along with the maximum likelihood estimates .
4.3 ECAC: BradleyTerrylike Model with Overtime/Shootout Results
Having developed the mechanisms to characterize the posterior distribution for the BradleyTerryDavidson model with three outcomes (win, tie, and loss), we apply similar analogues for the model with four outcomes: regulation win (RW), overtime/shootout win (OW), overtime/shootout loss (OL), and regulation loss (RL), now applied to the full 20202021 ECAC results shown in Table 1. As before, there is a logstrength parameter for each team, and is now the log of a parameter associated with overtime results. We show the maximum likelihood estimates in Table 5 along with the probabilities for a regulation win for an overtime/shootout win in contests between pairs of teams. In Table 6 we show the parameters of the Gaussian approximation to the posterior.
()  
Team 
Comments
There are no comments yet.