Multi-rater delta: extending the delta nominal measure of agreement between two raters to many raters
Many scientific areas need to measure the degree of agreement among R raters who independently classify n subjects into K nominal categories. The most popular measures are Cohen's kappa (R = 2) and the Fleiss' kappa, Conger's kappa and Hubert's kappa coefficients (R ≥ 2), all of which have several defects. In 2004, the delta coefficient was defined for the case R = 2; it does not have the defects of Cohen's kappa coefficient. This article extends the delta coefficient from R = 2 raters to R ≥ 2. The multi-rater delta coefficient has the same advantages over the kappa-type coefficients as the delta coefficient: i) it is intuitive and easy to interpret, because it refers to the proportion of replies that are concordant and non-random; ii) the summands which give its value allow the degree of agreement in each category to be measured accurately, with no need for the categories to be collapsed; and iii) it is not affected by marginal imbalance.
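To make the marginal-imbalance defect of the kappa-type coefficients concrete, the following minimal Python sketch computes Cohen's kappa (the standard formula, observed agreement corrected by the agreement expected from the marginals) for two raters on two tables. It is an illustration only and is not taken from the article: the function name `cohen_kappa` and both 2 x 2 tables are hypothetical, and the sketch does not compute the delta coefficient.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square K x K table of counts (rater 1 in rows, rater 2 in columns)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed proportion of agreement: the diagonal of the table.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each rater.
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance under independence of the two raters.
    p_exp = sum(row_marg[i] * col_marg[i] for i in range(k))
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical data: both tables have 80 agreements out of 100 subjects.
balanced = [[40, 10], [10, 40]]    # marginals 50/50 for both raters
imbalanced = [[80, 10], [10, 0]]   # marginals 90/10 for both raters

print(round(cohen_kappa(balanced), 2))    # approximately 0.60
print(round(cohen_kappa(imbalanced), 2))  # approximately -0.11
```

Both hypothetical tables have the same observed agreement (0.80), yet kappa falls from about 0.60 to about -0.11 when the marginals become unbalanced. This is the kind of behaviour that advantage iii) refers to.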