Davidson-Luce model for multi-item choice with ties

09/16/2019 ∙ by David Firth, et al. ∙ University of Warwick 0

This paper introduces a natural extension of the pair-comparison-with-ties model of Davidson (1970, J. Amer. Statist. Assoc.), to allow for ties when more than two items are compared. Properties of the new model are discussed. It is found that this "Davidson-Luce" model retains the many appealing features of Davidson's solution, while extending the scope of application substantially beyond the domain of pair-comparison data. The model introduced here already underpins the handling of tied rankings in the "PlackettLuce" R package.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Background: Pair comparisons

1.1 Bradley-Terry model and Davidson’s generalization for ties

A commonly used statistical model for pair-comparison data is the so-called Bradley-Terry model (Bradley and Terry, 1952), in which a binary outcome ‘ is preferred to ’ or ‘ beats

’ is assumed to have probability in the form

In the Bradley-Terry model each ‘item’ (or ‘player’) has their own unobserved ‘strength’ or ‘ability’ , and it is the relative values of and that determine the win-probabilities when and are compared.

The Bradley-Terry model is a logit-linear model for the binary outcome (

wins, or wins); and the ratio is readily interpretable as the odds (on winning, in a contest between and ). The Bradley-Terry model has also been shown (Luce, 1959, 1977) to follow from a simple and appealing axiom for behaviour when making choices among items. For choosing the preferred item from a finite set , Luce’s axiom implies choice probabilities

from which the Bradley-Terry model follows whenever contains only two elements.

The Bradley-Terry model’s outcomes are strictly binary: ties are not permitted. Davidson (1970) shows how to generalize the Bradley-Terry model to accommodate ties, in a way that does not violate Luce’s axiom. The Davidson model stipulates, for the three possible outcomes { wins, wins, tie} in a comparison of items and , probabilities (summing to 1) as follows:

Outcome: wins wins tie
Probability (proportional to):

The Davidson model thus incorporates a single additional parameter, , which describes the prevalence of ties; different values of will be appropriate in different application contexts.

1.2 Properties of the Davidson model

Some well-known properties of the Davidson model are as follows:

  1. The geometric mean

    has the same dimension as and ; that is to say, their units of measurement are the same. This makes the tie-prevalence parameter dimensionless, and straightforwardly interpretable. Specifically, the probability of a tie in any comparison between items of equal strength (i.e., ) is .

  2. Conditional upon the outcome not being a tie, the probability that wins is , exactly as in the Bradley-Terry model for binary outcomes. In this way the Davidson model maintains compatibility with Luce’s axiom.

  3. Like the Bradley-Terry model, the Davidson generalization depends on the strengths only through their relative values. The scale — or unit of measurement — of strengths is immaterial.

  4. For any fixed value of , the tie probability is proportional to and is maximized when . That is, ties are most likely when the items being compared have equal strength.

  5. The Davidson model is a full exponential family model, and so maximum likelihood estimation of the parameters (the strengths

    and the tie prevalence ) simply equates sufficient statistics with their expectations under the model. The sufficient statistics are

    • for each item, its observed number of ‘wins’ plus half its observed number of ties;

    • the total number of ties seen, in all comparisons made.

    See, e.g., Fienberg (1979) for full details of the model’s representation in log-linear form, and consequent solution of the likelihood equations in standard software.

  6. The preceding property has a neat implication when the Davidson model is applied to a ‘balanced round-robin’ tournament among items, where every item is compared with every other item the same number of times. In that context the maximum likelihood estimates are ordered in exactly the same way as would be simple, item-specific ‘points totals’, with 2 points awarded for a win and 1 point for a tie (Davidson, 1970). This holds regardless of the value of .

2 More than two items: Davidson-Luce model

2.1 Preamble

In this section we extend the Davidson model to comparisons involving more than two items. The new ‘Davidson-Luce model’ is designed to retain the key properties of the Davidson model.

In general we will suppose that a choice is to be made (i.e., a winner is to be determined) from items. The outcome can be a single ‘best’ item, or a tie between two or more of the items under comparison.

The model is introduced first for , before giving its general definition for any finite .

2.2 Choice from three items

With three items , there are 7 possible outcomes. We will label these here as

  • , , (a single item wins)

  • , , (two items are tied winners)

  • (all three items are tied winners)

The Davidson-Luce model in this case specifies 7 probabilities that sum to 1, in the following proportions:

Outcome:
Probability (proportional to):

In this model there are two separate tie-prevalence parameters, and , for the prevalence of 2-way ties and 3-way ties respectively. The interpretation of strengths is still as in the Luce model: conditional upon the outcome being an outright win for one item, the probabilities are in the ratios .

Still it is the case — as in the Bradley-Terry, Luce and Davidson models — that only relative values of the strength parameters affect the model. Moreover, as before, the tie probabilities are all maximized when strengths are equal.

The interpretation of is like that of in the Davidson model. For example, conditional upon not being included in the winning choice, the possible outcomes are , in which case is — as before — the probability of a 2-way tie between and when . Alternatively, if we condition only upon the outcome not being a 3-way tie, then with the probability of a 2-way tie is .

The interpretation of , similarly, is simplest in terms of the hypothetical situation of equal strengths (i.e., the situation where, for any given value of , the probability of a 3-way tie is maximized). The probability of a 3-way tie is then .

The extensions of properties 5 and 6 listed above for the Davidson model are as follows. The model is a full exponential family, whose sufficient statistics are:

  • for each item, its observed number of outright ‘wins’, plus of its observed number of 2-way ties, plus of its observed number of 3-way ties;

  • the total number of 2-way ties seen, in all comparisons made;

  • the total number of 3-way ties seen, in all comparisons made.

As a consequence, in a balanced round-robin tournament of 3-way comparisons involving items in total, the maximum likelihood estimates are ordered in exactly the same way as are simple, item-specific ‘points totals’, with 6 points awarded for an outright win, 3 points for a 2-way tied win, and 2 points for a 3-way tie. This holds regardless of the values of and .

Further discussion of the properties of the Davidson-Luce model is deferred to section 2.3. In the next subsection we show how this Davidson-Luce model extends, in an obvious way, to a choice made from any number of items.

2.3 Choice from any finite set

The model for , as described above, immediately suggests the form of the Davidson-Luce model for any .

In any given comparison, label the items being compared by , and denote by the set of possible ‘winning’ choices that might be made from the items being compared. For example, indicates an outright winner, indicates a 2-way tie, and so on, up to and including the possibility , which indicates that all items tied.

The Davidson-Luce model stipulates that the probability of any such choice is proportional to

(1)

where denotes the cardinality of set . Thus can take values in . The adjustable tie-prevalence parameters are ; the value of can be set arbitrarily to be 1, so is not actually a parameter in the model but is included here for presentational tidiness.

The constant of proportionality is just the normalizing constant, the reciprocal of the sum of over all possible choice sets . That normalizing constant can be straightforwardly computed, if needed, but, it involves a rapidly increasing number of terms as the value of increases. The model’s log-linear representation, which follows as a direct extension of Fienberg (1979)

, allows for simple iterative computation of estimates and associated standard errors without any need to evaluate the likelihood itself. A numerical illustration is provided in the Appendix, to show how this works in detail.

The Davidson (1970) model is a special case of the Davidson-Luce model, with and . The Luce model (Luce, 1959, 1977) is the special case in which ties are not allowed: that is, for all .

Here we briefly describe how the Davidson model properties listed above (in section 1) extend to the Davidson-Luce model.

  1. The geometric means all have the same dimensions as the strengths , and so the tie-prevalence parameters are all dimensionless. It was shown in section 2.2 above how to construct meaningful interpretations for those parameters.

  2. Conditional upon the outcome of a comparison not being a tie, the probability that wins is , for any in the comparison set . The Davidson-Luce model thus maintains compatibility with Luce’s axiom.

  3. As before, dependence on item strengths is only ever through their relative values.

  4. The tie probabilities all are in the form of geometric means, which are maximized when the items being compared have equal strengths.

  5. The Davidson-Luce model is still a full exponential family model as before, the sufficient statistics being

    • for each item, the total number of wins, counting a tied win fractionally in the obvious way;

    • the total numbers of ties seen, of each order (i.e., the count of 2-way ties, the count of 3-way ties, etc.).

    A straightforward extension of the log-linear representation in Fienberg (1979) leads to efficient solution of maximum likelihood equations — without any need to compute the likelihood itself — using standard software for generalized linear models.

  6. As already exemplified in section 2.2, the Davidson-Luce model continues to yield exact agreement with points-based league tables for fully balanced tournaments, provided that points are divided equally whenever items share a tied win.

In summary, then: the Davidson-Luce model retains the many appealing features of the Davidson model for ties, while extending the scope of application substantially beyond the limited domain of pair-comparison data.

3 Concluding remarks

A specific application of the ideas developed here is to the Plackett-Luce model (Turner et al., 2019), which generalizes Bradley-Terry models to analysis of rankings. In a Plackett-Luce model, it would typically be the case that tied “winners” can occur at any stage of the sequence of choices that forms a multi-item ranking; and this flexibility is what is implemented in the PlackettLuce package.

The PlackettLuce package also implements a prior penalty for Plackett-Luce models, which regularizes the likelihood with the aim of improving estimation. In particular, use of that prior penalty ensures that the conditions of Ford Jr (1957), which ensure existence and finiteness of parameter estimates, are always satisfied. The prior penalty, as implemented in the PlackettLuce package, requires no modification at all to work with the Davidson-Luce model. For full details on the PlackettLuce package and its use, see Turner et al. (2019).

References

  • Bradley and Terry (1952) Bradley, R. A. and M. E. Terry (1952). Rank analysis of incomplete block designs I: The method of paired comparisons. Biometrika 39, 324–45.
  • Davidson (1970) Davidson, R. R. (1970). On extending the Bradley-Terry model to accommodate ties in paired comparison experiments. Journal of the American Statistical Association 65(329), 317–328.
  • Fienberg (1979) Fienberg, S. E. (1979). Log linear representation for paired comparison models with ties and within-pair order effects. Biometrics 35, 479–481.
  • Ford Jr (1957) Ford Jr, L. R. (1957). Solution of a ranking problem from binary comparisons. The American Mathematical Monthly 64(8P2), 28–33.
  • Luce (1959) Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Analysis. New York: Wiley.
  • Luce (1977) Luce, R. D. (1977). The choice axiom after twenty years. Journal of Mathematical Psychology 15(3), 215–233.
  • Turner et al. (2019) Turner, H., J. van Etten, D. Firth, and I. Kosmidis (2019). Modelling rankings in R: The PlackettLuce package. arXiv preprint 1810.12068.

Appendix A Appendix: Computation via Poisson log-linear model representation

Here we use a small, artificial example to show details of implementation of the Davidson-Luce model in R, using maximum likelihood via a log-linear representation as suggested by Fienberg (1979).

a.1 Davidson-Luce model for a small, contrived example

We imagine here a 4-player round-robin tournament in which each ‘contest’ involves exactly 3 of the 4 players. A single round-robin tournament thus has 4 contests, in this setting.

The data we will use are as follows:

triples_round_robin <- matrix(c(
    NA, 1, 0, 0,
    1, NA, 1, 0,
    0,  1, NA, 1,
    1, 1, 1, NA),
    4, 4, byrow = TRUE,
    dimnames = list(contest = c("BCD", "ACD", "ABD", "ABC"),
                    winner = c("A", "B", "C", "D"))
    )
triples_round_robin
##        winner
## contest  A  B  C  D
##     BCD NA  1  0  0
##     ACD  1 NA  1  0
##     ABD  0  1 NA  1
##     ABC  1  1  1 NA

The first contest is won outright by player ; the second is tied between and ; the third is tied between and ; and the fourth is a 3-way tie between , and .

The simple tournament-scoring system described in Section 2.2, with 6 points shared across the winners of each contest, gives points totals as follows:

6 * colSums(triples_round_robin / rowSums(triples_round_robin, na.rm = TRUE),
            na.rm = TRUE)
##  A  B  C  D
##  5 11  5  3

So in this small tournament is the clear winner, with and jointly second.

To fit the Davidson-Luce model via its Poisson log-linear representation, we first expand the data to a form that has a separate row for each possible outcome of every contest. To do this we will use a special-purpose function named expand_outcomes (whose definition is shown at the end, below).

expanded_data <- expand_outcomes(triples_round_robin)
print(expanded_data, digits = 2)
##          comparison    A    B    C    D delta2 delta3 outcome
## 1: B              1 0.00 1.00 0.00 0.00      0      0       1
## 1: C              1 0.00 0.00 1.00 0.00      0      0       0
## 1: D              1 0.00 0.00 0.00 1.00      0      0       0
## 1: B=C            1 0.00 0.50 0.50 0.00      1      0       0
## 1: B=D            1 0.00 0.50 0.00 0.50      1      0       0
## 1: C=D            1 0.00 0.00 0.50 0.50      1      0       0
## 1: B=C=D          1 0.00 0.33 0.33 0.33      0      1       0
## 2: A              2 1.00 0.00 0.00 0.00      0      0       0
## 2: C              2 0.00 0.00 1.00 0.00      0      0       0
## 2: D              2 0.00 0.00 0.00 1.00      0      0       0
## 2: A=C            2 0.50 0.00 0.50 0.00      1      0       1
## 2: A=D            2 0.50 0.00 0.00 0.50      1      0       0
## 2: C=D            2 0.00 0.00 0.50 0.50      1      0       0
## 2: A=C=D          2 0.33 0.00 0.33 0.33      0      1       0
## 3: A              3 1.00 0.00 0.00 0.00      0      0       0
## 3: B              3 0.00 1.00 0.00 0.00      0      0       0
## 3: D              3 0.00 0.00 0.00 1.00      0      0       0
## 3: A=B            3 0.50 0.50 0.00 0.00      1      0       0
## 3: A=D            3 0.50 0.00 0.00 0.50      1      0       0
## 3: B=D            3 0.00 0.50 0.00 0.50      1      0       1
## 3: A=B=D          3 0.33 0.33 0.00 0.33      0      1       0
## 4: A              4 1.00 0.00 0.00 0.00      0      0       0
## 4: B              4 0.00 1.00 0.00 0.00      0      0       0
## 4: C              4 0.00 0.00 1.00 0.00      0      0       0
## 4: A=B            4 0.50 0.50 0.00 0.00      1      0       0
## 4: A=C            4 0.50 0.00 0.50 0.00      1      0       0
## 4: B=C            4 0.00 0.50 0.50 0.00      1      0       0
## 4: A=B=C          4 0.33 0.33 0.33 0.00      0      1       1

The expanded_data object is an ordinary data frame that can be used with R’s standard functions for fitting generalized linear models. The Davidson-Luce model could now just be fitted by maximum likelihood in R through a call to glm(), as a Poisson log-linear model as follows:

DLmodel <- glm(outcome ~ comparison + A + B + C + D + delta2 + delta3,
               family = poisson, data = expanded_data)

But here the factor named comparison is included purely for technical reasons, to ensure that the fitted probabilities (over the 7 possible outcomes in each contest here) sum to 1. That factor is not of any interest, and so for tidiness — as well as a slight improvement in computational efficiency — we will use gnm (from the gnm package) instead of glm. The advantage of gnm here is that it allows the ‘nuisance’ factor comparison to be included more cleanly in the model via the eliminate argument:

library(gnm)
DLmodel <- gnm(outcome ~ A + B + C + D + delta2 + delta3, eliminate = comparison,
               family = poisson, data = expanded_data)
DLmodel
##
## Call:
##
## gnm(formula = outcome ~ A + B + C + D + delta2 + delta3, eliminate = comparison,
##     family = poisson, data = expanded_data)
##
## Coefficients of interest:
##      A       B       C       D  delta2  delta3
##  2.071   6.864   2.071      NA   2.390   3.249
##
## Deviance:            11.35986
## Pearson chi-squared: 14.20569
## Residual df:         19

The reported model parameters are on the log scale; and the parameterization here has arbitrarily set to 1, to resolve parameter redundancy.

So, for example is estimated to be .

The two tie-prevalence parameters here are both estimated to be very large: and . This is due to the deliberately common occurrence of ties in this dataset, in order to demonstrate how ties are handled; and also the fact that the estimated player strengths here differ widely. (The data seen here would suggest that in notional contests where players all have equal strengths, ties would be extremely common.)

a.2 Agreement with full round-robin ‘points totals’

Since this was a fully balanced round robin tournament design, then as mentioned in Section 2.2 the fit of the Davidson-Luce model should agree exactly with the simple points totals that were calculated above. Those points totals do indeed agree with their expectations under the fitted Davidson-Luce model:

DLfitted <- predict(DLmodel, type = "response")
print(DLfitted, digits = 2)
##     1: B     1: C     1: D   1: B=C   1: B=D   1: C=D 1: B=C=D     2: A
##  0.34278  0.00284  0.00036  0.34071  0.12096  0.01101  0.18133  0.02967
##     2: C     2: D   2: A=C   2: A=D   2: C=D 2: A=C=D     3: A     3: B
##  0.02967  0.00374  0.32385  0.11498  0.11498  0.38312  0.00284  0.34278
##     3: D   3: A=B   3: A=D   3: B=D 3: A=B=D     4: A     4: B     4: C
##  0.00036  0.34071  0.01101  0.12096  0.18133  0.00200  0.24096  0.00200
##   4: A=B   4: A=C   4: B=C 4: A=B=C
##  0.23950  0.02181  0.23950  0.25423
expected_points_totals <- 6 * colSums(expanded_data[, c("A","B","C","D")] * DLfitted)
expected_points_totals
##         A         B         C         D
##  5.000000 11.000000  5.000000  3.000001

The actual points totals, from above, were 5, 11, 5 and 3. The agreement is exact, apart from numerical error due to the iteration-stopping rule that was used by gnm.

a.3 Illustration of tie-prevalence interpretations

The interpretation of tie-prevalence parameters and was described in Section 2.2, in terms of the probabilities in a notional contest involving only players of equal ability.

Merely as a numerical illustration of those interpretations, we re-fit here the Davidson-Luce model, but with the constraint that strengths are all equal to 1 (so that their logarithms are all zero).

DL_equal_strengths <- update(DLmodel, . ~ . - A - B - C - D)
DL_equal_strengths
##
## Call:
## gnm(formula = outcome ~ delta2 + delta3, eliminate = comparison,
##     family = poisson, data = expanded_data)
##
## Coefficients of interest:
## delta2  delta3
## 0.6931  1.0986
##
## Deviance:            14.90944
## Pearson chi-squared: 24
## Residual df:         22

The tie-prevalence estimates here are and . Agreement with the detailed interpretations shown in Section 2.2 can thus be checked as follows:

coefs <- coef(DL_equal_strengths)
print(round(coefs, 4))
## Coefficients of interest:
## delta2 delta3
## 0.6931 1.0986
delta2 <- exp(coefs[1])
delta3 <- exp(coefs[2])
delta2/(1 + delta2)
##    delta2
## 0.6666667
delta3/(3 + 3*delta2 + delta3)
## delta3
##   0.25

These values agree with what was seen in the data, which was 2 two-way ties out of the 3 contests whose outcome was not a 3-way tie (so ), and one 3-way tie out of the 4 contests observed in total (so ).

a.4 Definition of the function used to expand the data

For completeness here, we show the full definition of the function that made the dataframe named expanded_data in the above.

The function shown here is very much a prototype, not programmed for efficiency, robustness or scalability.

expand_outcomes
## function(m) {
##     n_comparisons <- nrow(m)
##     n_items <- ncol(m)
##     items <- colnames(m)
##     rvec <- apply(m, 1, function(row) sum(!is.na(row)))
##     tvec <- apply(m, 1, function(row) sum(na.omit(row)))
##     maxt <- max(tvec)
##     if (maxt > 1) delta_names <- paste0("delta", 2:maxt)
##     n_possible_outcomes  <- integer(n_comparisons)
##     for (i in 1:n_comparisons) {
##         n_possible_outcomes[i] <- sum(choose(rvec[i], 1:(min(rvec[i], maxt))))
##     }
##     result <- matrix(0, sum(n_possible_outcomes), n_items + maxt + 1)
##     colnames(result) <- c("comparison", colnames(m), delta_names, "outcome")
##     rownames(result) <- as.character(1:nrow(result))
##     filled <- 0
##     for (comparison in 1:n_comparisons){
##         involved <- items[!is.na(m[comparison, ])]
##         for (t in 1:maxt) {
##             combs <- combn(involved, t)
##             for (index in 1:ncol(combs)){
##                 result[filled + index, 1] <- comparison
##                 result[filled + index, 1 + which(items %in% combs[, index])] <- 1/t
##                 if (t > 1) {
##                     result[filled + index, n_items + t] <- 1
##                 }
##                 if (all(na.omit(t * result[filled + index, 1 + (1:n_items)] -
##                                 m[comparison, ]) == 0)) {
##                     result[filled + index, "outcome"] <- 1
##                 }
##                 rownames(result)[filled + index] <-
##                     paste(comparison, paste0(combs[, index], collapse = "="),
##                           sep = ": ")
##             }
##             filled <- filled + ncol(combs)
##         }
##     }
##     result <- as.data.frame(result)
##     result$comparison <- as.factor(result$comparison)
##     return(result)
## }
## <bytecode: 0x36ad6c0>