# A New Correlation Coefficient for Aggregating Non-strict and Incomplete Rankings

We introduce a correlation coefficient that is specifically designed to deal with a variety of ranking formats including those containing non-strict (i.e., with-ties) and incomplete (i.e., null) preferences. The new measure, which can be regarded as a generalization of the seminal Kendall tau correlation coefficient, is proven to be equivalent to a recently developed axiomatic ranking distance. In an effort to further unify and enhance the two robust ranking methodologies, this work proves the equivalence of an additional axiomatic-distance and correlation-coefficient pairing in the space of non-strict incomplete rankings. In particular, the bridging of these complementary theories reinforces the singular suitability of the featured correlation coefficient to solve the general consensus ranking problem. The latter premise is further bolstered by an accompanying set of experiments on random instances, which are generated via a herein developed sampling technique connected with the classic Mallows distribution of ranking data. To carry out the featured experiments we devise a specialized branch and bound algorithm that provides the full set of alternative optimal solutions efficiently. Applying the algorithm to the generated random instances reveals that the featured correlation coefficient yields consistently fewer alternative optimal solutions as data becomes noisier.

## 1 Introduction

The consensus ranking problem (i.e., ranking aggregation) is at the center of many group decision-making processes. It entails finding an ordinal vector or ranking of a set of competing objects that minimizes disagreement with a profile of preferences (represented as ranking vectors). Common examples include corporate project selection, research funding processes, and academic program rankings (Hochbaum and Levin, 2006). Moreover, the mathematical measures and aggregation algorithms devised to solve the consensus ranking problem often find ready application in many other fields. For example, in Information Retrieval these fundamental tools have been used to compare, aggregate, and evaluate the accuracy of metasearch engine lists (Hassanzadeh and Milenkovic, 2014). Hence, although this work considers the group decision-making context of ranking aggregation for ease of interpretability, many of its results could be readily adapted to other contexts such as Artificial Intelligence (Betzler et al., 2014) and Biostatistics (Lin, 2010b).

Although the mathematical roots of consensus ranking trace back to the development of voting systems of de Borda (de Borda, 1781) and Condorcet (Condorcet, 1785), significant work remains to deal with real-world situations that upend many of the problem’s rigid and long-running assumptions. These issues have garnered renewed interest from the Operations Research community owing largely to the general intractability engendered by the more robust ranking aggregation systems (Sen et al., 2014). Even so, it may be beneficial to consider transdisciplinary efforts in solving close variants of this problem. This work incorporates concepts from the statistical literature where the analog median ranking problem has been used for classification, prediction, and several other applications (Heiser and D’Ambrosio, 2013; D’Ambrosio and Heiser, 2016). The fundamental goal for intertwining these viewpoints is to reinforce and advance theoretical and computational aspects of consensus ranking when dealing with indispensable forms of ranking data. Additionally, this unison is intended to yield insights and perspectives that are generalizable to many other areas where similar problems have been considered.

This work deals with the consensus ranking problem in which the set of input rankings may contain ties and may be incomplete. This variety of preferences is the rule rather than the exception in group decision-making (Emond and Mason, 2002), making it imperative to utilize frameworks that possess this flexibility; otherwise, judges are implicitly forced to make arbitrary and/or careless choices. Moreover, since there is typically a finite budget or set of benefits that is to be allocated commensurate with the competitors’ positions in the consensus ranking, the chosen frameworks must employ robust measures that align with the given context. Specifically, this work assumes a neutral treatment of incomplete rankings, meaning a judge’s preferences over her unranked objects are unknown—because in the most general case it is assumed she does not evaluate them—and, therefore, no inferences should be made about her preferences of these objects relative to other unranked or ranked objects. Such a treatment is particularly pertinent in situations where the evaluation of a large object set can be realistically accomplished only via the allocation of smaller subsets—which may differ both in content and size—to various judges. Practical reasons for this include time, expertise, and conflict of interest constraints (Hochbaum and Levin, 2010). It is also prudent to partition such a large undertaking into smaller tasks given that objectivity deteriorates and frustration grows as the number of alternatives evaluated increases (Saaty and Ozdemir, 2003; Basili and Vannucci, 2015). The theory and algorithms developed in this work are tailored to deal with these and other commonplace considerations.

The herein assumed neutral treatment of incomplete rankings clearly differs from the top- treatment in which unranked objects are assumed to be tied for ordinal position , making them all strictly less-preferred than the explicitly ranked objects (e.g., see Mamoulis et al. (2007); Klementiev et al. (2008)). Although the featured correlation coefficient is not specially designed for the latter context, it can be provisionally (though not ideally) adapted for this purpose (see §4.1).

This work makes the following novel contributions. First, it proves that the $\tau_x$ ranking correlation coefficient devised in Emond and Mason (2002) is inadequate for dealing with incomplete rankings when the unranked objects do not connote any preferential information. Second, it develops the $\hat{\tau}_x$ ranking correlation coefficient for dealing with a wide variety of ranking inputs; this measure is equivalent to $\tau_x$ when the input rankings are restricted to be complete, and it is equivalent to the seminal Kendall $\tau$ correlation coefficient (Kendall, 1938) when they are further restricted to not contain ties. Third, it proves that the featured correlation coefficient is equivalent to the axiomatic ranking distance recently developed in Moreno-Centeno and Escobedo (2016). Thus, as a whole, the first three contributions refine and unify distance and correlation-based ranking aggregation. Fourth, it devises a customized branch and bound algorithm for solving the non-strict incomplete ranking aggregation problem and for obtaining all alternative optimal solutions efficiently. Fifth, it extends the repeated insertion model of Doignon et al. (2004) to sample non-strict incomplete rankings from statistical distributions linked with the classic Mallows $\phi$-distribution of ranking data (Mallows, 1957). As motivated by the foregoing discussion, it is reasonable to expect that these contributions will find useful interpretations within other fields.

The remainder of the paper is organized as follows. §2 introduces the notation and conventions utilized throughout this work. §3 reviews the pertinent literature on axiomatic distances and correlation coefficients for quantifying differences and similarities between rankings. §4 presents key theoretical results that strengthen the correlation-based framework for handling incomplete rankings: §4.1 demonstrates the inadequacy of Emond and Mason’s correlation coefficient; §4.2 derives the correlation coefficient featured in this work; §4.3 establishes the equivalence between two key axiomatic-distance and correlation-coefficient pairings and derives additional analytical insights from these connections. §5 presents a set of new algorithmic tools and experiments to compare the usefulness of these measures to solve the non-strict incomplete ranking aggregation problem: §5.1 develops an exact branch and bound algorithm; §5.2 devises an efficient statistical sampling framework used to construct nontrivial instances motivated by real-world scenarios; §5.3–§5.5 employ these scenarios to test two properties desired of ranking aggregation measures: decisiveness and electoral fairness. Lastly, §6 concludes the work and discusses future avenues of research.

## 2 Notation and Preliminary Conventions

Denoting as a set of competing objects, a judge’s ranking or ordinal evaluation of is characterized by a vector of dimension of , whose th element denotes the ordinal position assigned to object . If , is said to prefer to (or to disprefer to ), and when , is said to tie and for position , where and . Additionally, when is assigned a null value—heretofore signified by the symbol “”— is said to be unranked within ; the objects explicitly ranked in are denoted by the subset (i.e., for ). For example, in the -object ranking is preferred over and ; and are tied for the second position but both are preferred over ; is left unranked; and . It is important to emphasize that, although any object that is unranked within receives the same assignment , it is not considered tied with other unranked objects or better/worse than the ranked objects.

Later sections also refer to the object-ordering induced by a mapping , where denotes the set of weak orders (or complete preorders) on objects. That is, sorts the objects in from best to worst, according to their ranks in . For example, for , . Extending this notation, specifies the th-highest ranked object in when does not contain ties (Doignon et al., 2004)—that is, is a bijection in this case and the inverse function returns a linear order. When contains ties, sorts the objects into preference equivalence classes. For example, for , . In the case of ties, the inverse mapping returns the ranking obtained by labeling each object with its corresponding equivalence class position in . For example, for , .

Various ranking aggregation systems may not be defined or equipped to properly handle the full variety of ranking data formats alluded to in the preceding paragraphs. For this reason, the following definitions highlight three primary ranking spaces by which they can be categorized.

###### Definition 1.

Let denote the broadest ranking space consisting of all possible strict, non-strict, complete, and incomplete rankings—corresponding to rankings without ties, with and without ties, full, and partial and full, respectively. Since non-strict rankings encompass strict rankings and incomplete rankings encompass complete rankings, is denoted alternatively as the space of non-strict incomplete rankings.

###### Definition 2.

Let denote the space of complete rankings over objects, which consists of all possible non-strict (and strict) rankings in which every object must be explicitly ranked (i.e., partial evaluations are disallowed).

###### Definition 3.

Let denote the space of strict rankings over objects, which consists of all possible incomplete (and complete) rankings in which no objects may be tied.

Notice that Definition 3 explicitly excludes the subset of unranked objects within a ranking (i.e. ) as being tied even though every unranked object receives the same assignment . This convention is adopted to convey a null or unknown preference over all unranked objects within a ranking in accordance with the neutral treatment of incomplete rankings assumed throughout this paper.

From the above definitions, it is evident that , , and and are mutually incomparable with respect to set containment. To describe the ranking aggregation problem addressed in this work, let denote an arbitrary ranking correlation function. The correlation-based non-strict incomplete ranking aggregation problem (NIRAP) is stated formally as:

$$\arg\max_{r \in \Omega_C} \sum_{k=1}^{K} \dot{\tau}(r, a^k), \tag{1}$$

where $a^k \in \Omega$ for $k = 1, \dots, K$ (i.e., $k$ is the index of each judge or ranking). Alternatively, adopting the same notation and denoting $\dot{d}$ as an arbitrary ranking distance function, the distance-based NIRAP is stated formally as:

$$\arg\min_{r \in \Omega_C} \sum_{k=1}^{K} \dot{d}(r, a^k). \tag{2}$$

Expression (1) can be intuitively interpreted as the problem of finding a ranking $r$ that maximizes agreement, quantified according to $\dot{\tau}$, with a set of non-strict incomplete rankings; Expression (2) can be intuitively interpreted as the problem of finding a ranking $r$ that minimizes disagreement, quantified according to $\dot{d}$, with the same inputs. As shown in §4.3, for select axiomatic-distance and correlation-coefficient pairings, the two respective optimization problems are equivalent. It is imperative to point out that, although the input rankings are allowed to be incomplete to allow flexibility of preference expression, the consensus ranking is required to lie in the space of complete rankings; that is, $r \in \Omega_C$ is a constraint in both problems. This condition is enforced because obtaining an evaluation for all considered objects is relevant for many group decision-making situations, but it may be modified to suit other contexts (e.g., top-$k$ ranking aggregation (Fagin et al., 2003)).

## 3 Literature Review

The principal focus of this work is on deterministic metric-based methods for comparing and aggregating rankings, which are regarded as the most robust methodologies within Operations Research and Social Choice (Brandt et al., 2016). The reader is directed to Cook (2006) for a review of score or utility based methods, which are relatively more computationally efficient but cannot fulfill certain fundamental social choice properties associated with voting fairness (e.g., the Condorcet criterion (Condorcet, 1785) and its extensions (Young, 1988; Young and Levenglick, 1978)). Additionally, there is a rich body of literature on nondeterministic or model-based ranking aggregation methods (e.g., see Fligner and Verducci, 1986; Mallows, 1957; Marden, 2014). While these often rely on axiomatic distances, they are incomparable with the featured context in various notable respects including their assumptions, aggregation processes, and outputs.

### 3.1 Axiomatic Distances

Several axiomatic distances have been proposed to derive a consensus ranking, with each method solving a different variant of this classic problem. Most prominently, these distances are established in Moreno-Centeno and Escobedo (2016); Kemeny and Snell (1962); Bogart (1973, 1975); Cook et al. (1986); Dwork et al. (2001); Cook et al. (2007). Within each work, the respective distance is typically advocated as the most suitable for aggregating inputs drawn from a specific ranking space through a set of axioms it uniquely satisfies. As Moreno-Centeno and Escobedo (2016) demonstrated, however, all of the listed distances other than their own are either unable to deal with the broadest ranking space or they inadvertently induce significant systematic biases associated with unfairness when applied (or extended) to solve the distance-based NIRAP. Thus, in pursuit of the fundamental goals of allowing a wide variety of ranking data and upholding a fair or equitable process, this subsection focuses on this recently developed distance and on the precursor distances upon which it is founded.

The first axiomatic distance was introduced by Kemeny and Snell (1962) to aggregate non-strict complete rankings; the distance function, written here succinctly as $d_{KS}$, quantifies the disagreement between a pair of ranking vectors as follows:

$$d_{KS}(a,b) = \frac{1}{\gamma}\sum_{i=1}^{n}\sum_{j=1}^{n}\left|\operatorname{sign}(a_i - a_j) - \operatorname{sign}(b_i - b_j)\right|, \tag{3}$$

where $a, b \in \Omega_C$ and $\gamma$ is a positive constant associated with a chosen minimum positive distance unit. In Kemeny and Snell (1962), $\gamma$ is set to 2, corresponding to a minimum distance unit of 1 (since each object pair is counted twice in the above expression), but henceforth it is fixed to 4, corresponding to a minimum positive distance unit of 1/2, which does not affect the solution to Problem (2) but has a convenient interpretation for handling ties (Moreno-Centeno and Escobedo, 2016). Put simply, $d_{KS}$ measures the number of pairwise rank reversals required to turn $a$ into $b$; when the rankings do not contain ties, this is also known as the bubble-sort distance (Hochbaum and Levin, 2006). The $d_{KS}$ distance is synonymous with robust ranking aggregation in space $\Omega_C$ (Ailon et al., 2008) owing to the combination of social choice properties for mitigating manipulation, enforcing fairness, and reducing individual human biases that it uniquely satisfies (see Young (1988); Young and Levenglick (1978)).
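As an illustration of Equation (3), the following minimal Python sketch computes the Kemeny-Snell distance between two complete rankings stored as lists of ordinal positions. The function names `sign` and `kemeny_snell` are hypothetical choices made for this example, and the default `gamma=4` reflects the minimum distance unit of 1/2 adopted in this work.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kemeny_snell(a, b, gamma=4):
    """Kemeny-Snell distance of Equation (3) for two complete rankings.

    a[i] and b[i] hold the ordinal positions assigned to object i.
    Every ordered pair (i, j) is visited, so each unordered pair is
    counted twice, exactly as in the double sum of Equation (3).
    """
    if len(a) != len(b):
        raise ValueError("rankings must cover the same objects")
    n = len(a)
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(sign(a[i] - a[j]) - sign(b[i] - b[j]))
    return total / gamma


# A strict ranking versus its reverse: every one of the 3 pairs is flipped.
full_reversal = kemeny_snell([1, 2, 3], [3, 2, 1])
# A strict preference versus a tie on one pair: half of a rank reversal.
half_flip = kemeny_snell([1, 2, 3], [1, 2, 2])
```

With `gamma=4`, a full reversal of one object pair contributes 1 and a tie against a strict preference contributes 1/2, which matches the interpretation of ties described above.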

Although $d_{KS}$ was not originally defined to handle incomplete rankings, a seemingly straightforward extension was devised in Cook et al. (2007) and Dwork et al. (2001). The underlying axioms for this distance, referred to as the Projected Kemeny Snell distance and written here succinctly as $d_{P-KS}$, are provided in Moreno-Centeno and Escobedo (2016). The corresponding distance function is defined as:

$$d_{P-KS}(a,b) = d_{KS}\left(a|_{(V_a \cap V_b)},\, b|_{(V_a \cap V_b)}\right), \tag{4}$$

where in this case $a, b \in \Omega$ and where $a|_{(V_a \cap V_b)}$, $b|_{(V_a \cap V_b)}$ denote the projections of each ranking onto the subset of objects evaluated in both rankings. In other words, $d_{P-KS}$ enforces the intuitive interpretation that ranking disagreements should be based only on the objects ranked in common by $a$ and $b$ rather than on the differences in which objects were ranked (or not ranked) by each judge. That said, when $d_{P-KS}$ is utilized to solve the NIRAP, the consensus ranking tends to favor judges who rank a higher number of objects since distances in higher dimensional spaces tend to be comparatively larger. This means that when unequal numbers of items are evaluated, $d_{P-KS}$ induces systematic inegalitarianism, by which unequal voting power is imposed on each input ranking. Consequently, some judges may exert disproportionate influence in the aggregation process, which can lead to unfair and quasi-dictatorial outcomes; i.e., a judge can dominate the aggregate ranking despite the oppositely aligned preferences of a large majority.

To overcome the principal drawback of $d_{P-KS}$, Moreno-Centeno and Escobedo (2016) developed the normalized projected Kemeny Snell distance, written here succinctly as $d_{NP-KS}$. The $d_{NP-KS}$ distance is equivalent to $d_{KS}$ when inputs are restricted to space $\Omega_C$ and, thus, it can be regarded as a generalization of the Kemeny Snell distance to the broadest space $\Omega$. The authors proved that $d_{NP-KS}$ uniquely satisfies an intuitive set of axioms desired of any distance defined in space $\Omega$ (see Appendix 7.1). The corresponding distance function is defined as:

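$$d_{NP-KS}(a,b) = \begin{cases} \dfrac{d_{KS}\left(a|_{(V_a \cap V_b)},\, b|_{(V_a \cap V_b)}\right)}{\bar{n}(\bar{n}-1)/2} & \text{if } \bar{n} \geq 2, \\[2ex] 0 & \text{otherwise}, \end{cases} \tag{5}$$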

where $\bar{n} = |V_a \cap V_b|$ is the number of objects ranked in common, which must be at least 2 since the Kemeny-Snell framework relies on pairwise comparisons. The above denominator ensures $0 \leq d_{NP-KS}(a,b) \leq 1$, regardless of how many objects are ranked or unranked by $a$ or $b$. In essence, this gives equal voting power to each input ranking or judge in the aggregation process. A pragmatic benefit of achieving egalitarianism irrespective of the different numbers of objects ranked is the elimination of the often unenforceable or unrealistic requirement of having to allocate an equal number of objects for each judge to evaluate. Indeed, in many cases a uniform assignment of objects may be impossible to implement due to differing expertise, conflicting schedules, unplanned exemptions (e.g., conflicts of interest with the evaluated alternatives), and plenty of other practical reasons.
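The contrast between Equations (4) and (5) can be sketched in a few lines of Python. This is only an illustrative sketch: unranked objects are encoded as `None` (an assumption of this example, not the paper's notation), and the helper names `project`, `projected_ks`, and `normalized_projected_ks` are hypothetical.

```python
def sign(x):
    return (x > 0) - (x < 0)


def kemeny_snell(a, b, gamma=4):
    """Equation (3) for complete rankings, with gamma fixed to 4 by default."""
    n = len(a)
    total = sum(abs(sign(a[i] - a[j]) - sign(b[i] - b[j]))
                for i in range(n) for j in range(n))
    return total / gamma


def project(a, b):
    """Keep only the objects ranked (not None) in both a and b."""
    common = [i for i in range(len(a))
              if a[i] is not None and b[i] is not None]
    return [a[i] for i in common], [b[i] for i in common]


def projected_ks(a, b):
    """Equation (4): Kemeny-Snell distance on the commonly ranked objects."""
    pa, pb = project(a, b)
    return kemeny_snell(pa, pb)


def normalized_projected_ks(a, b):
    """Equation (5): projected distance divided by n_bar * (n_bar - 1) / 2."""
    pa, pb = project(a, b)
    n_bar = len(pa)
    if n_bar < 2:
        return 0.0
    return kemeny_snell(pa, pb) / (n_bar * (n_bar - 1) / 2)


# Two judges who rank four objects in opposite orders ...
long_a, long_b = [1, 2, 3, 4, None], [4, 3, 2, 1, None]
# ... and two judges who rank only two objects in opposite orders.
short_a, short_b = [1, 2, None, None, None], [2, 1, None, None, None]

unnormalized = (projected_ks(long_a, long_b), projected_ks(short_a, short_b))
normalized = (normalized_projected_ks(long_a, long_b),
              normalized_projected_ks(short_a, short_b))
```

Both pairs are in complete disagreement, yet the projected distance is six times larger for the pair that ranks more objects, while the normalized distance assigns both pairs the same maximum value. The projection does not relabel positions because Equation (3) depends only on the signs of rank differences.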

### 3.2 Correlation Coefficients

Correlation coefficients are an alternative methodology for quantifying differences in rankings with an extensive history and wide array of applications in statistical literature (e.g., see Yilmaz et al. (2008); Lehmann (1966); Kochar and Gupta (1987); Liu et al. (2012)). Naturally, the agreement between judges $a$ and $b$ is measured on the interval $[-1, 1]$, where the minimum and maximum values indicate complete disagreement and complete agreement, respectively. The most prominent is the Kendall $\tau$ (tau) correlation coefficient (Kendall, 1938), which was subsequently adapted into the $\tau_b$ correlation coefficient in Kendall (1948) to handle non-strict rankings. However, Emond and Mason (2002) gave compelling evidence that $\tau_b$ exhibits serious flaws when handling non-strict rankings; for example, $\tau_b$ yields an undefined correlation value of 0/0 when comparing the all-ties ranking to itself or to any other non-strict ranking. To replace it, the authors introduced the $\tau_x$ (tau-extended) ranking correlation coefficient, which relies on an alternative score matrix representation of $a$ denoted as $[a_{ij}]$ whose individual elements are defined as:

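$$a_{ij} = \begin{cases} 1 & \text{if object } i \text{ is ranked ahead of or tied with object } j, \\ -1 & \text{if object } i \text{ is ranked behind object } j, \\ 0 & \text{if } i = j, \end{cases} \tag{6}$$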

where $i, j = 1, \dots, n$. Here, a tie connotes a positive statement of agreement; conversely, the $\tau_b$ score matrix (not shown) treats a tie as a declaration of indifference by assigning it a score of 0. Thus, adopting the former interpretation, the correlation between rankings $a$ and $b$ (with underlying score matrices $[a_{ij}]$ and $[b_{ij}]$) is given by the function:

$$\tau_x(a,b) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} b_{ij}}{n(n-1)}. \tag{7}$$

Furthermore, Emond and Mason (2002) proved that $\tau_x$ and $d_{KS}$ are equivalent representations of the unique measure satisfying the Kemeny and Snell (1962) axioms in space $\Omega_C$, by connecting them via the equation:

$$\tau_x(a,b) = 1 - \frac{\gamma\, d_{KS}(a,b)}{n(n-1)}, \tag{8}$$

where $\gamma$ is a constant associated with the minimum distance unit (see Equation (3)); since this work adopts a minimum distance unit of 1/2, it fixes $\gamma = 4$ in Equation (8). This connection endows $\tau_x$ with the intuitive axiomatic foundation of $d_{KS}$. At the same time, it suggests that the inadequacies of the latter to handle incomplete rankings described in Moreno-Centeno and Escobedo (2016) carry over to the former. This premise is further explored in the ensuing section.
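The relationship in Equation (8) can be checked numerically on small complete rankings. The sketch below assumes the tau-extended score matrix assigns 1 when object $i$ is ranked ahead of or tied with object $j$, $-1$ when it is ranked behind, and 0 on the diagonal; the function names are hypothetical and chosen only for this illustration.

```python
import math


def sign(x):
    return (x > 0) - (x < 0)


def kemeny_snell(a, b, gamma=4):
    """Equation (3) for complete rankings."""
    n = len(a)
    total = sum(abs(sign(a[i] - a[j]) - sign(b[i] - b[j]))
                for i in range(n) for j in range(n))
    return total / gamma


def score_matrix(a):
    """Tau-extended score matrix: ties are scored as agreement (+1)."""
    n = len(a)
    return [[0 if i == j else (1 if a[i] <= a[j] else -1)
             for j in range(n)] for i in range(n)]


def tau_x(a, b):
    """Equation (7): matrix inner product divided by n(n - 1)."""
    n = len(a)
    A, B = score_matrix(a), score_matrix(b)
    inner = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))
    return inner / (n * (n - 1))


def tau_x_from_distance(a, b, gamma=4):
    """Right-hand side of Equation (8)."""
    n = len(a)
    return 1 - gamma * kemeny_snell(a, b, gamma) / (n * (n - 1))


# Pairs covering a half-flip, the all-ties ranking, a full reversal,
# and a tie compared against a strict preference.
pairs = [([1, 2, 3], [1, 2, 2]),
         ([1, 1, 1], [1, 1, 1]),
         ([1, 2, 3], [3, 2, 1]),
         ([1, 1], [1, 2])]
agreement = [math.isclose(tau_x(a, b), tau_x_from_distance(a, b))
             for a, b in pairs]
```

Note that the all-ties ranking correlates perfectly with itself under this score matrix, which is the case where a score of 0 for ties would produce the undefined value 0/0.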

## 4 Handling Incomplete Rankings via Correlation Coefficients

Up to this point, there has not been a ranking correlation coefficient explicitly tailored for dealing with non-strict incomplete rankings (those belonging to the broadest ranking space ), to the best of our knowledge. Indeed, although Emond and Mason (2002) suggested that could fulfill this extended role, this assertion has not been formally proved nor empirically validated. Hence, the first of the ensuing subsections examines this hypothesis. Afterward, §4.2 introduces the ranking correlation coefficient for dealing with rankings in space , along with the properties and axioms it satisfies. This new ranking correlation coefficient is shown to be equivalent to and to when the input rankings are restricted to lie in spaces and , respectively. Then, §4.3 establishes the equivalence of with the axiomatic distance as well as the equivalence of with when the input rankings lie in space , and it elaborates on the consequences.

### 4.1 Inadequacy of the Kendall Tau-Extended Correlation Coefficient

This subsection provides cogent evidence that $\tau_x$ is not an adequate measure for quantifying and aggregating differences between incomplete rankings. Specifically, counter to what is claimed in Emond and Mason (2002), employing $\tau_x$ produces incongruous and counterintuitive results when a judge’s unranked objects should convey no preferential information. The veracity of these assertions is established via intuitive examples and the accompanying discussion.

Table 1 displays a simple instance consisting of ten incomplete rankings in space over object set . Since, for every pair of objects two or more judges strictly prefer over while only one judge ties them, for , it is reasonable to expect the egalitarian outcome as the unique consensus ranking. However, using the correlation coefficient, the unique optimum is , mirroring the preferences of exactly; most strikingly, this occurs even though is strictly preferred over by three of the four judges who evaluate . Effectively, wields quasi-dictatorial influence due to the relatively higher number of items it ranks. This indicates that applying to aggregate incomplete rankings inadvertently imposes unequal voting power into the aggregation process, specifically benefiting rankings with higher completeness. Conversely, the correlation coefficient introduced in §4.2 overcomes these key issues by effectively assigning equal ranking power to each judge—for the Table 1 example, yields the unique optimal solution .

The inadequacy of to handle incomplete rankings can be discerned at a more fundamental level from its inability to yield the expected values 1 and when an incomplete ranking is correlated with itself and with its reverse ranking, respectively. In fact, the achievable correlation range shrinks as the number of ranked objects decreases, which translates into systematic inegalitarianism. For instance, when and when . Conversely, the new measure introduced in §4.2 prevents the correlation range contraction by discounting the impact of unranked objects through the inclusion of a scaling factor.

Although this work does not advocate applying , , or directly for top- ranking aggregation, a provisional approach for this special context is as follows. First, fill in every unranked position with the value ; second, solve the modified instance utilizing if the input contains ties and otherwise. Alternatively, since is equivalent to and in the restricted spaces and , respectively (see §4.2), the ranking correlation coefficient introduced in this work can be employed in both cases.

### 4.2 Derivation of the Scaled Kendall Tau-Extended Correlation Coefficient

To quantify the agreement between non-strict incomplete rankings via correlation coefficients, a fundamental requirement is that the correlation between any pair of rankings must lie within the interval $[-1, 1]$. Specifically, the $-1$ and 1 values must be achieved whenever $a$ and $b$ completely disagree and completely agree, respectively; otherwise, a value lying strictly in the interior of the interval should be returned commensurate with the level of agreement between the rankings. These and other basic requirements that a given ranking correlation function should satisfy in space $\Omega$ are captured through the set of metric-like axioms exhibited in Table 2. As explained in §4.1, $\tau_x$ cannot fulfill some of these essential requirements in its current form. Hence, this subsection derives a new correlation coefficient, which will be proven to satisfy the Table 2 axioms in §4.3. As a first step, we define a corresponding score matrix representation for a ranking $a$ as:

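$$a_{ij} = \begin{cases} 1 & \text{if object } i \text{ is ranked ahead of or tied with object } j, \\ -1 & \text{if object } i \text{ is ranked behind object } j, \\ 0 & \text{if } i = j \text{, or if object } i \text{ or object } j \text{ (or both) is unranked}, \end{cases} \tag{9}$$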

where $i, j = 1, \dots, n$. Note that this score matrix can be obtained by extending Equation (6) to also assign $a_{ij} = 0$ whenever object $i$ or $j$ (or both) is unranked in $a$ and, thus, it is equivalent to the $\tau_x$ score matrix when the input rankings are complete. This extension was cursorily proposed in Emond and Mason (2002), although it was neither explicitly stated nor implemented therein. It is chosen as the basis of the new correlation coefficient also because its treatment of ties is equivalent to the Kemeny Snell “half-flip” metric, which assigns only half of a rank reversal between $a$ and $b$ whenever one ties objects $i$ and $j$ but the other professes a strict preference for $i$ over $j$ or vice versa.

As a second step, consider the score matrices $[a_{ij}]$ and $[b_{ij}]$ of rankings $a$ and $b$, each defined according to Equation (9), and their associated matrix inner product:

$$\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} b_{ij}.$$

When and rank every object, the number of non-zeros in each score matrix and the maximum matrix inner product are both equal to . The reasons are that the score matrix diagonal elements are all 0 and that for all when . It is also straightforward to discern that a minimum matrix inner product of can be achieved only if does not contain ties and for all .

When or does not rank every object, for each such that either or the th score matrix row and column are set with all zeros, thereby decreasing the maximum/increasing the minimum matrix inner products by . Put otherwise, such a matrix inner product may be calculated as if the th row and column of both score matrices do not exist. Hence, the maximum and minimum inner products of and are reduced to and , respectively, where . Accordingly, a new correlation function can be derived to achieve the full expected correlation interval . It is named the scaled Kendall tau-extended correlation coefficient, written here succinctly as , and is defined as:

$$\hat{\tau}_x(a,b) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} b_{ij}}{\bar{n}(\bar{n}-1)}, \tag{10}$$

which may be rewritten in terms of to explain the chosen nomenclature via the equation:

$$\hat{\tau}_x(a,b) = \frac{n(n-1)}{\bar{n}(\bar{n}-1)}\,\tau_x(a,b), \tag{11}$$

assuming the underlying score matrix of $\tau_x$ is given by Equation (9). Expressly, this alternative expression emphasizes that, by scaling $\tau_x$ by the factor $n(n-1)/(\bar{n}(\bar{n}-1))$, $\hat{\tau}_x$ removes the impact of irrelevant pairwise preference comparisons (the pairs of objects unranked by $a$, $b$, or both) from the correlation. As a result, the correlation minimum and maximum values $-1$ and 1 may be achieved when a non-strict incomplete ranking is compared with a suitable ranking. Precise details of these guarantees are deferred to the ensuing subsection (see Corollary 3) since they can be conveniently derived through the set of key theoretical connections therein established.

Clearly, when the Equation (11) scaling factor equals 1, meaning is equivalent to in space ; hence, in this restricted space Equation (10) becomes Equation (7). Furthermore, is equivalent to in space due to the equivalence between and in said space (Emond and Mason, 2002). Thus, since possesses the same advantages as and when the rankings are restricted to spaces and , respectively, it remains to demonstrate why is uniquely suited to deal with the broader space of non-strict incomplete rankings . Further theoretical foundations for this claim are established in the next subsection and additional practical reasons are given by the empirical results obtained in §5.3–§5.5.
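The effect of the scaling factor in Equations (10) and (11) can be seen in a small Python sketch. It encodes unranked objects as `None`, uses hypothetical function names, and raises an error when fewer than two objects are ranked in common; that last choice is an assumption of this example, since Equation (10) is not defined in that case.

```python
def score_matrix(a):
    """Equation (9): rows and columns of unranked objects (None) are zero."""
    n = len(a)

    def entry(i, j):
        if i == j or a[i] is None or a[j] is None:
            return 0
        return 1 if a[i] <= a[j] else -1

    return [[entry(i, j) for j in range(n)] for i in range(n)]


def inner_product(a, b):
    """Matrix inner product of the two score matrices."""
    A, B = score_matrix(a), score_matrix(b)
    n = len(a)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))


def tau_x(a, b):
    """Equation (7) applied to the extended score matrix of Equation (9)."""
    n = len(a)
    return inner_product(a, b) / (n * (n - 1))


def tau_hat_x(a, b):
    """Equation (10): divide by n_bar(n_bar - 1), n_bar = objects ranked by both."""
    n_bar = sum(1 for x, y in zip(a, b) if x is not None and y is not None)
    if n_bar < 2:
        raise ValueError("fewer than two objects ranked in common")
    return inner_product(a, b) / (n_bar * (n_bar - 1))


# An incomplete ranking of four objects and its reverse on the ranked pair.
a = [1, 2, None, None]
reverse = [2, 1, None, None]

unscaled = (tau_x(a, a), tau_x(a, reverse))
scaled = (tau_hat_x(a, a), tau_hat_x(a, reverse))
```

Here the unscaled coefficient can only reach 1/6 and −1/6 because ten of the twelve off-diagonal score entries are zero, whereas the scaled coefficient attains the full range, in line with Equation (11) with $n = 4$ and $\bar{n} = 2$.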

### 4.3 Key Axiomatic-Distance and Correlation-Coefficient Pairings

This subsection proves that $\hat{\tau}_x$ is equivalent to the normalized projected Kemeny-Snell distance $d_{NP-KS}$, which is the unique distance satisfying the set of intuitive axioms exhibited in Appendix 7.1 (Moreno-Centeno and Escobedo, 2016). Because $\hat{\tau}_x$ is equivalent to $d_{NP-KS}$, $\hat{\tau}_x$ inherits the axioms for distance-based aggregation in space $\Omega$. Table 2 provides the corresponding axioms for correlation coefficient-based aggregation, which are further contextualized in Appendix 7.1. This subsection also proves the equivalence of another key axiomatic-distance and correlation-coefficient pairing in space $\Omega$. Together these results fill a significant gap in the literature because although Emond and Mason (2002) made a connection between distance and correlation-based methods for aggregating complete rankings (see Equation (8)), they conjectured that a parallel connection could not be established when dealing with incomplete rankings.

###### Theorem 1 (Linear transformation between $\hat{\tau}_x$ and $d_{NP-KS}$).

Let $a$ and $b$ be two arbitrary rankings over $n$ objects drawn from the space of non-strict incomplete rankings, $\Omega$. Then, the correlation coefficient $\hat{\tau}_x$ and the distance $d_{NP-KS}$ are connected through the following equation:

$$d_{NP-KS}(a,b) = \frac{1}{2} - \frac{1}{2}\,\hat{\tau}_x(a,b). \tag{12}$$

###### Proof.

For succinctness, denote $\bar{a}$ and $\bar{b}$ as the rankings over $\bar{n}$ objects obtained by projecting $a$ and $b$ onto the subset of objects ranked in common. Notice that $\bar{a}$ and $\bar{b}$ are complete rankings over the same reduced universe of objects (i.e., they lie in space $\Omega_C$ relative to this reduced universe). As such, using $1/2$ as the minimum distance unit, the corresponding $\tau_x$ and $d_{KS}$ values for $\bar{a}$ and $\bar{b}$ are equated as follows (Emond and Mason, 2002):

$$\tau_x(\bar{a},\bar{b}) = 1 - \frac{4\,d_{KS}(\bar{a},\bar{b})}{\bar{n}(\bar{n}-1)},$$

which, expressed in terms of $d_{KS}$, yields the equivalent relationship:

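$$
\begin{aligned}
d_{KS}(\bar{a},\bar{b}) &= \frac{\bar{n}(\bar{n}-1)}{4} - \frac{\bar{n}(\bar{n}-1)\,\tau_x(\bar{a},\bar{b})}{4} && \text{(13)}\\
&= \frac{\bar{n}(\bar{n}-1)}{4} - \frac{\bar{n}(\bar{n}-1)\sum_{i=1}^{\bar{n}}\sum_{j=1}^{\bar{n}} \bar{a}_{ij}\bar{b}_{ij}}{4\,\bar{n}(\bar{n}-1)} && \text{(14)}\\
&= \frac{\bar{n}(\bar{n}-1)}{4} - \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}b_{ij}}{4}, && \text{(15)}
\end{aligned}
$$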

where Equation (14) applies the definition of $\tau_x$ (see Equation (7)) with respect to $\bar{a}$ and $\bar{b}$; and where Equation (15) cancels a common factor in the second term and utilizes the fact that unranked items in either ranking vector contribute nothing to the sum, that is, the matrix inner products are identical in the original and projected spaces. Now, multiplying both sides of Equation (15) by $2/(\bar{n}(\bar{n}-1))$ gives:

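$$\frac{d_{KS}(\bar{a},\bar{b})}{\bar{n}(\bar{n}-1)/2} = \frac{1}{2} - \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}b_{ij}}{2\,\bar{n}(\bar{n}-1)} \;\Rightarrow\; d_{NP-KS}(a,b) = \frac{1}{2} - \frac{1}{2}\,\hat{\tau}_x(a,b).$$
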
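The identity in Theorem 1 can be checked numerically with a minimal sketch such as the one below. It rests on assumptions that are not restated in this section: rankings are encoded as lists of rank positions (lower is better, equal values are ties, `None` marks an unranked object), the $\tau_x$ score matrix follows the Emond and Mason convention (ties scored as $+1$), the Kemeny-Snell matrix scores ties as $0$, and the minimum distance unit is $1/2$. The function names are hypothetical.

```python
from fractions import Fraction

def tau_matrix(a):
    """Assumed tau_x score matrix: +1 if i is ahead of or tied with j,
    -1 if behind, 0 on the diagonal or if either object is unranked."""
    n = len(a)
    return [[0 if i == j or a[i] is None or a[j] is None
             else (1 if a[i] <= a[j] else -1)
             for j in range(n)] for i in range(n)]

def ks_matrix(a):
    """Assumed Kemeny-Snell matrix: +1 ahead, -1 behind, 0 otherwise."""
    n = len(a)
    return [[0 if i == j or a[i] is None or a[j] is None
             else (a[i] < a[j]) - (a[i] > a[j])
             for j in range(n)] for i in range(n)]

def common(a, b):
    """Indices of the objects ranked by both a and b."""
    return [i for i in range(len(a)) if a[i] is not None and b[i] is not None]

def tau_hat(a, b):
    """Inner product of score matrices scaled by n_bar * (n_bar - 1)."""
    n, nb = len(a), len(common(a, b))
    A, B = tau_matrix(a), tau_matrix(b)
    inner = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))
    return Fraction(inner, nb * (nb - 1))

def d_np_ks(a, b):
    """Kemeny-Snell distance of the projections (unit 1/2), normalized
    by its maximum value n_bar * (n_bar - 1) / 2."""
    c = common(a, b)
    nb = len(c)
    A, B = ks_matrix(a), ks_matrix(b)
    d_ks = Fraction(sum(abs(A[i][j] - B[i][j]) for i in c for j in c), 4)
    return d_ks / Fraction(nb * (nb - 1), 2)

a = [1, 2, 2, None, 3]   # objects 1 and 2 tied, object 3 unranked
b = [2, 1, None, 1, 3]   # object 2 unranked
print(tau_hat(a, b), d_np_ks(a, b))
print(d_np_ks(a, b) == Fraction(1, 2) - tau_hat(a, b) / 2)
```

Exact rational arithmetic is used so that the equality test is not affected by floating-point rounding. Both functions require at least two objects ranked in common.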
###### Corollary 1.

The respective NIRAP optimization problems typified by $d_{NP-KS}$ and $\hat{\tau}_x$ are equivalent and, thus, provide identical consensus rankings.

###### Proof.

This is established through the following series of equations:

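$$
\begin{aligned}
\operatorname*{arg\,min}_{r\in\Omega_C} \sum_{k=1}^{K} d_{NP-KS}(r,a^k) &= \operatorname*{arg\,max}_{r\in\Omega_C} \sum_{k=1}^{K} -d_{NP-KS}(r,a^k) && \text{(16)}\\
&= \operatorname*{arg\,max}_{r\in\Omega_C} \sum_{k=1}^{K} -\left[\frac{1}{2} - \frac{1}{2}\,\hat{\tau}_x(r,a^k)\right] && \text{(17)}\\
&= \operatorname*{arg\,max}_{r\in\Omega_C} \sum_{k=1}^{K} \hat{\tau}_x(r,a^k), && \text{(18)}
\end{aligned}
$$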

where the last equation results from the fact that scalars common to every term in the sum and constant terms do not impact the optimal solution.

It is also expedient to find an equivalent axiomatic distance corresponding to $\tau_x$ in space $\Omega$ (recall that Emond and Mason (2002) proved that $\tau_x$ is equivalent to $d_{KS}$ only in the restricted space $\Omega_C$).

###### Theorem 2 (Linear transformation between $\tau_x$ and $d_{P-KS}$).

Let $a$ and $b$ be two arbitrary rankings of $n$ objects drawn from the space of non-strict incomplete rankings $\Omega$. Then, the correlation coefficient $\tau_x$ and the distance $d_{P-KS}$ are connected through the following equation:

$$d_{P-KS}(a,b) = \frac{\bar{n}(\bar{n}-1)}{4} - \frac{n(n-1)}{4}\,\tau_x(a,b), \tag{19}$$

where $\bar{n}$ is the number of objects explicitly ranked by both $a$ and $b$.

###### Proof.

See Appendix 7.2. ∎
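Although the full proof is deferred to the appendix, Equation (19) can be sketched from the proof of Theorem 1. The sketch assumes (neither point is restated in this section) that $d_{P-KS}(a,b)$ is the Kemeny-Snell distance between the projections $\bar{a}$ and $\bar{b}$, and that $\tau_x$ is evaluated in $\Omega$ through Equation (7) with zero score-matrix entries for unranked objects:

$$
d_{P-KS}(a,b) = d_{KS}(\bar{a},\bar{b}) = \frac{\bar{n}(\bar{n}-1)}{4} - \frac{1}{4}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}b_{ij} = \frac{\bar{n}(\bar{n}-1)}{4} - \frac{n(n-1)}{4}\,\tau_x(a,b).
$$

Here the second equality is Equation (15) and the third substitutes $\sum_{i}\sum_{j} a_{ij}b_{ij} = n(n-1)\,\tau_x(a,b)$. Unlike Equation (12), the first term depends on $\bar{n}$ and therefore on the pair of rankings being compared.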

###### Corollary 2.

The respective NIRAP optimization problems typified by $d_{P-KS}$ and $\tau_x$ are equivalent and, thus, provide identical consensus rankings.

###### Proof.

See Appendix 7.3. ∎

The connection between $\hat{\tau}_x$ and $d_{NP-KS}$ (see Theorem 1 and Corollary 1) provides synergistic support for the appropriateness of each measure to handle a realistic variety of ranking data. The suitability of $d_{NP-KS}$ is reinforced by the fact that $\hat{\tau}_x$ was purposely designed to attain the expected correlation interval $[-1, 1]$ when dealing with rankings from space $\Omega$ (see §4.1). Inversely, these connections equip $\hat{\tau}_x$ with a corresponding axiomatic foundation, from which its theoretical guarantees can be formally established. This includes the occurrence of the extrema correlation values $\pm 1$.

###### Corollary 3.

Let $a, b \in \Omega$. A maximum correlation value of $\hat{\tau}_x(a,b) = 1$ is achieved if and only if $a$ and $b$ are the same ranking, and a minimum value of $\hat{\tau}_x(a,b) = -1$ is achieved if and only if $a$ is a linear ordering (i.e., it contains no ties) and $b$ is the reverse linear ordering of $a$.

###### Proof.

See Appendix 7.4. ∎

In a nutshell, the above corollary results from a combination of Axioms 2 and 7. Moreover, by directly substituting $\hat{\tau}_x$ in place of $d_{NP-KS}$ for the remaining five axioms listed in Appendix 7.1, a corresponding axiomatic foundation for $\hat{\tau}_x$ is straightforward to obtain (see Table 2). Lastly, notice that since $d_{NP-KS}$ uniquely satisfies the distance-based axioms, Theorem 1 and Corollary 3 together establish that $\hat{\tau}_x$ uniquely satisfies the corresponding correlation-based axioms.

## 5 Comprehensive Solution of the Correlation-Based NIRAP

Many competitive group decision-making scenarios entail the allocation of a limited budget among a subset of participants proportional to the consensus ranking solution. In these scenarios it may be prudent to obtain the full set of alternative optimal solutions when multiple consensus rankings exist, or to conclusively determine that the optimal solution is unique, to avoid unfair and/or arbitrary outcomes. However, obtaining even just one consensus or median ranking via correlation-based methods (or the equivalent axiomatic distance-based methods) is an NP-hard problem (Bartholdi et al., 1989). What is more, owing to the immensity of the solution space (the number of possible non-strict complete rankings of $n$ objects grows faster than $n!$; see Gross, 1962), the NIRAP has been solved largely via specialized algorithms rather than general-purpose integer programming techniques. This section develops a customized branch and bound algorithm (B&B) for fulfilling the aforementioned objectives efficiently given input rankings from space $\Omega$. In particular, B&B allows for an efficient exploration of the solution space by pruning unpromising branches (i.e., ordinal combinations), thereby avoiding full enumeration. It is important to highlight that B&B uses $\hat{\tau}_x$ instead of $d_{NP-KS}$ because the former enables a much quicker evaluation of the candidate solutions and modifications performed during execution. This computational edge comes from the fact that the definition of $d_{NP-KS}$ contains non-linear terms (see Equations (3) and (5)) while that of $\hat{\tau}_x$ is fully linear (see Equation (10)). Later in this section, B&B is implemented to compare the abilities of $\tau_x$ and $\hat{\tau}_x$ to achieve egalitarianism and decisiveness, characterized by the propensity to yield unique or few alternative optima, when solving nontrivial NIRAP instances. To this end, a probabilistic approach for generating instances from space $\Omega$ is also introduced.

### 5.1 A Customized Branch and Bound Algorithm

Emond and Mason (2002) devised a branch and bound algorithm for solving the general ranking aggregation problem that relies on a succinct function of cumulative agreement between the set of input rankings $\{a^k\}_{k=1}^{K}$ and an iteratively evolving candidate-solution vector $r$. In this respect, $\tau_x$ offers a significant advantage over its distance counterpart since:

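$$\sum_{k=1}^{K} \tau_x(r,a^k) = \sum_{k=1}^{K} \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a^k_{ij}\,r_{ij}}{n(n-1)} = \frac{1}{n(n-1)} \sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}\,r_{ij}, \tag{20}$$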

where $[a^k_{ij}]$ and $[r_{ij}]$ represent the score matrices of $a^k$ and $r$, respectively; and where $A = [A_{ij}]$, with $A_{ij} = \sum_{k=1}^{K} a^k_{ij}$, is defined as the combined input (CI) matrix. In particular, once the CI matrix is computed, the number of matrix inner products required to calculate cumulative agreement relative to any candidate solution is reduced from $K$ to one. Furthermore, this expedient data structure enables a form of sensitivity analysis for determining the increase/decrease in cumulative agreement that would result if the preference or ordinal relationships of a few objects in the candidate ranking are altered. Conversely, the cumulative distance function for $d_{P-KS}$ (the axiomatic-distance counterpart of $\tau_x$) does not yield as wieldy an expression due to the presence of nonlinear terms (see Equations (3) and (4)).

As Expression (20) indicates, the denominator $n(n-1)$ is common to each correlation coefficient term $\tau_x(r,a^k)$, for $k = 1, \dots, K$. Therefore, it can be factored out of the cumulative correlation calculation and ignored in the corresponding optimization process. Conversely, each correlation coefficient term $\hat{\tau}_x(r,a^k)$ yields a different denominator, determined by the number of objects ranked in common between $r$ and $a^k$, thereby rendering Expression (20) and the current version of Emond and Mason's algorithm inapplicable for $\hat{\tau}_x$. To address this issue, the following theorem introduces and proves the validity of a corresponding function of cumulative agreement.

###### Theorem 3 (Succinct function of cumulative agreement for $\hat{\tau}_x$).

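Let $r \in \Omega_C$, $a^k \in \Omega$, and $\bar{n}_k = |V_{a^k}|$ (the number of objects ranked by $a^k$), for $k = 1, \dots, K$. Then, the cumulative correlation between $r$ and the input rankings $a^1, \dots, a^K$ can be computed according to the function: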

$$\sum_{k=1}^{K} \hat{\tau}_x(r,a^k) = \sum_{i=1}^{n}\sum_{j=1}^{n} \hat{A}_{ij}\,r_{ij}, \tag{21}$$

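where $\hat{A} = [\hat{A}_{ij}]$, with $\hat{A}_{ij} = \sum_{k=1}^{K} a^k_{ij} / (\bar{n}_k(\bar{n}_k-1))$, is the scaled combined input (SCI) matrix.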

###### Proof.

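Since $r$ is a complete ranking over the object set $V$, we have $|V_{a^k} \cap V| = |V_{a^k}| = \bar{n}_k$, and the term $\hat{\tau}_x(r,a^k)$ can be simplified as follows: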

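$$\hat{\tau}_x(r,a^k) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a^k_{ij}\,r_{ij}}{|V_{a^k}\cap V|\,(|V_{a^k}\cap V|-1)} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a^k_{ij}\,r_{ij}}{\bar{n}_k(\bar{n}_k-1)}.$$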

Thus, the denominator associated with each term is constant irrespective of the candidate-solution vector, thereby yielding the equivalent expressions:

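$$\sum_{k=1}^{K} \hat{\tau}_x(r,a^k) = \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{a^k_{ij}}{\bar{n}_k(\bar{n}_k-1)}\,r_{ij} = \sum_{i=1}^{n}\sum_{j=1}^{n} \hat{A}_{ij}\,r_{ij}. \qquad \square$$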
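To make Theorem 3 concrete, the following minimal sketch builds the SCI matrix once and then evaluates the cumulative correlation of a candidate with a single matrix inner product. The encoding (lists of rank positions, `None` for unranked objects) and the score-matrix convention (ties scored as $+1$, as in Emond and Mason) are assumptions, and the function names are hypothetical.

```python
from fractions import Fraction

def tau_matrix(a):
    """Assumed tau_x score matrix: +1 if i is ahead of or tied with j,
    -1 if behind, 0 on the diagonal or if either object is unranked."""
    n = len(a)
    return [[0 if i == j or a[i] is None or a[j] is None
             else (1 if a[i] <= a[j] else -1)
             for j in range(n)] for i in range(n)]

def sci_matrix(rankings):
    """Sum of the input score matrices, each scaled by 1/(n_k*(n_k-1)),
    where n_k is the number of objects ranked by the k-th input."""
    n = len(rankings[0])
    A_hat = [[Fraction(0)] * n for _ in range(n)]
    for a in rankings:
        nk = sum(x is not None for x in a)  # must be at least 2
        S = tau_matrix(a)
        for i in range(n):
            for j in range(n):
                A_hat[i][j] += Fraction(S[i][j], nk * (nk - 1))
    return A_hat

def cumulative_agreement(A_hat, r):
    """Right-hand side of Equation (21) for a complete candidate r."""
    R = tau_matrix(r)
    n = len(r)
    return sum(A_hat[i][j] * R[i][j] for i in range(n) for j in range(n))

def tau_hat(r, a):
    """Direct evaluation of one correlation term, for comparison."""
    nk = sum(x is not None for x in a)
    R, S = tau_matrix(r), tau_matrix(a)
    n = len(r)
    inner = sum(S[i][j] * R[i][j] for i in range(n) for j in range(n))
    return Fraction(inner, nk * (nk - 1))

inputs = [[1, 2, 3, None], [2, 1, None, None], [1, 1, 2, 3]]
candidate = [1, 2, 3, 4]
A_hat = sci_matrix(inputs)
print(cumulative_agreement(A_hat, candidate))
print(sum(tau_hat(candidate, a) for a in inputs))
```

Both printed values coincide, as Equation (21) predicts. Replacing every scaling factor with $1/(n(n-1))$ would recover the CI-based Expression (20).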

Following the calculation of the CI and SCI matrices, the steps of the branch and bound algorithms for $\tau_x$ and $\hat{\tau}_x$ are identical; for the reader's convenience, Figure 1 provides a flowchart of our featured algorithm B&B, which is complemented by the outline in the remainder of this paragraph. First, the absolute values of the SCI (CI, respectively) matrix entries are summed to yield an upper bound on the cumulative correlation achievable by any candidate-solution vector. An initial deviation penalty corresponding to a user-specified starting solution (obtained randomly or via a heuristic) is then calculated by subtracting its objective value from said upper bound. To describe the ensuing steps, recall the object-ordering induced by the mapping function (see §2). Proceeding through the objects in the order in which they are ranked in the reference starting solution, the algorithm calculates the incremental penalties of fixing the current object to every possible pairwise preference relative to a candidate sub-ranking of the previously placed objects by inspecting the respective SCI matrix entries. Three branches are created to reflect the possible ordinal relationships (i.e., preferred, tied, and dispreferred) between the current object and each object in the sub-ranking. If the incremental penalty of a branch exceeds the current minimum penalty, the branch is pruned; otherwise it is explored by considering the next object. The algorithm prioritizes newly created branches when there are multiple branches to explore. Once a complete ranking is obtained, the minimum penalty is updated and the ranking is saved as a possible solution; at the end of the algorithm, all rankings with the final minimum penalty are returned as the set of optimal solutions (i.e., the median or consensus rankings).
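The quantities used by the algorithm (the upper bound, the deviation penalty, and the set of alternative optima) can be illustrated on tiny instances with the sketch below. It is a brute-force enumeration baseline, not the branch and bound procedure itself: it performs no pruning and is only practical for a handful of objects. The ranking encoding, the tie convention of the score matrix, and all function names are assumptions.

```python
from fractions import Fraction
from itertools import product

def tau_matrix(a):
    """Assumed tau_x score matrix: +1 if i is ahead of or tied with j,
    -1 if behind, 0 on the diagonal or if either object is unranked."""
    n = len(a)
    return [[0 if i == j or a[i] is None or a[j] is None
             else (1 if a[i] <= a[j] else -1)
             for j in range(n)] for i in range(n)]

def sci_matrix(rankings):
    """Scaled combined input matrix (each input ranks >= 2 objects)."""
    n = len(rankings[0])
    A_hat = [[Fraction(0)] * n for _ in range(n)]
    for a in rankings:
        nk = sum(x is not None for x in a)
        S = tau_matrix(a)
        for i in range(n):
            for j in range(n):
                A_hat[i][j] += Fraction(S[i][j], nk * (nk - 1))
    return A_hat

def weak_orders(n):
    """Yield every non-strict complete ranking of n objects as a dense
    rank vector (ranks used are exactly 1..m for some m <= n)."""
    for ranks in product(range(1, n + 1), repeat=n):
        if set(ranks) == set(range(1, max(ranks) + 1)):
            yield list(ranks)

def all_consensus_rankings(rankings):
    """Return (minimum penalty, list of every optimal complete ranking)."""
    n = len(rankings[0])
    A_hat = sci_matrix(rankings)
    # Upper bound on cumulative correlation: sum of absolute entries.
    upper = sum(abs(x) for row in A_hat for x in row)
    best_penalty, best = None, []
    for r in weak_orders(n):
        R = tau_matrix(r)
        value = sum(A_hat[i][j] * R[i][j]
                    for i in range(n) for j in range(n))
        penalty = upper - value  # deviation from the upper bound
        if best_penalty is None or penalty < best_penalty:
            best_penalty, best = penalty, [r]
        elif penalty == best_penalty:
            best.append(r)
    return best_penalty, best

inputs = [[1, 2, 3], [1, 2, None], [1, 3, 2]]
penalty, optima = all_consensus_rankings(inputs)
print(penalty, optima)
```

In this instance the inputs agree that the first object is best but split evenly on the other two, so three alternative optima are returned. A pruning scheme such as the one outlined above would reach the same set without visiting every candidate.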

### 5.2 Generation of Representative Instances from Space Ω

In the computational study carried out in (Moreno-Centeno and Escobedo, 2016), $d_{NP-KS}$ outperformed $d_{P-KS}$ in yielding fewer alternative optima when solving instances with “predictable consensus rankings” of the distance-based NIRAP. Although these results seem to support the premise that $\hat{\tau}_x$ is better suited than $\tau_x$ to solve the correlation-based NIRAP (due to Corollaries 1 and 2), their scope is limited, primarily because of the restrictive types of instances considered therein. Specifically, since enumeration was employed to solve the problems exactly, each tested instance consisted of a maximum of 15 non-strict incomplete rankings of seven objects. Moreover, to generate the random instances used in their experiments, three simplistic templates were defined. The first two initialize every input ranking to a reference ranking $\underline{a}$ (the all-ties ranking and the identity permutation, respectively), and a third differs slightly in that it initializes a minority of the inputs to the reverse of the reference ranking used in the second template (the inverse of the identity permutation). For all three templates, incompleteness is then introduced for a random number and selection of objects within each initialized ranking. Such instances are not characteristic of most group decision-making settings, mainly in that the incorporated level of individual disagreement (or lack thereof) is rather extreme. An alternative experimental design that considers more diverse forms of agreement/disagreement is hereby presented to compare the relative decisiveness and electoral fairness of $\tau_x$ and $\hat{\tau}_x$ on a larger scope.

To generate test instances that are representative of different group decision-making scenarios, we devise a sampling approach based on the Mallows $\phi$-model of ranking data (Mallows, 1957), which is tailored to distance-based methods (Marden, 1996). The standard $\phi$-model is parameterized by a reference or “ground truth” ranking $\underline{a}$ and dispersion $\phi$, which in conjunction with the $d_{KS}$ distance quantify the probability of observing a ranking $a$ as:

$$P(a) = P(a \mid \underline{a}, \phi) = \frac{1}{Z}\,\phi^{\,d_{KS}(a,\underline{a})},$$

where $Z$ is the normalization constant. Since setting $\phi$ to 1 yields the uniform distribution over the space of rankings and setting it nearer to 0 centers the distribution mass closer to $\underline{a}$, the dispersion parameter effectively controls the proximity of each generated ranking to the reference ranking (Lu and Boutilier, 2014). This means nontrivial instances with objectively defined degrees of collective similarity/dissimilarity can be obtained by sampling from this distribution using different dispersion values. Table 3 defines four types of simple real-world group decision-making scenarios according to different ranges of $\phi$.
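For very small numbers of objects, the role of the dispersion parameter can be illustrated by computing the $\phi$-model probabilities exhaustively, as in the minimal sketch below. This is not the sampling technique developed in the paper. It assumes that the sample space is the set of non-strict complete rankings, that $d_{KS}$ uses $1/2$ as its minimum unit, and that rankings are encoded as dense rank vectors; incompleteness is not handled, and all function names are hypothetical.

```python
import random
from itertools import product

def weak_orders(n):
    """Every non-strict complete ranking of n objects (dense rank vectors)."""
    for ranks in product(range(1, n + 1), repeat=n):
        if set(ranks) == set(range(1, max(ranks) + 1)):
            yield list(ranks)

def sign(x, y):
    """+1 if x is ranked ahead of y, -1 if behind, 0 if tied."""
    return (x < y) - (x > y)

def d_ks(a, b):
    """Kemeny-Snell distance between complete rankings, with 1/2 as the
    assumed minimum distance unit."""
    n = len(a)
    return sum(abs(sign(a[i], a[j]) - sign(b[i], b[j]))
               for i in range(n) for j in range(n)) / 4

def mallows_pmf(reference, phi):
    """Enumerate the space and return (rankings, probabilities)."""
    space = list(weak_orders(len(reference)))
    weights = [phi ** d_ks(a, reference) for a in space]
    z = sum(weights)  # normalization constant Z
    return space, [w / z for w in weights]

def sample_rankings(reference, phi, size, rng):
    """Draw `size` rankings from the enumerated distribution."""
    space, probs = mallows_pmf(reference, phi)
    return rng.choices(space, weights=probs, k=size)

reference = [1, 2, 3]
rng = random.Random(42)
for phi in (0.1, 0.5, 1.0):
    space, probs = mallows_pmf(reference, phi)
    print(phi, round(probs[space.index(reference)], 3),
          sample_rankings(reference, phi, 3, rng))
```

The printed probability of the reference ranking shrinks toward the uniform value as $\phi$ approaches 1. Because the enumeration grows very quickly with the number of objects, this approach is only meant to illustrate the distribution, not to generate realistic test instances.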

While Table 3 describes scenarios where every ranking is drawn using the same dispersion value, it is plausible to encounter situations where evaluations can be said to come from different ground truths or ranges of $\phi$ (i.e., where more than one type of group participates). Table 4 gives two types of complex scenarios to reflect such combinations. For the first scenario, two dispersions exist over the same ground truth: one is the dispersion of a majority of experts and the other is the dispersion for a minority of jumbled heterogeneous opinions (i.e., spammers). For the second scenario, two opposing ground truths exist,