Response to Moffat's Comment on "Towards Meaningful Statements in IR Evaluation: Mapping Evaluation Measures to Interval Scales"

12/22/2022
by Marco Ferrante, et al.

Moffat recently commented on our previous work. That work focused on how grounding our evaluation methodology in the theory of measurement can improve our knowledge and understanding of the evaluation measures we use in IR, and how it can shed light on the different types of scales adopted by those measures; we also provided evidence, through extensive experimentation, on the impact of the different types of scales on statistical analyses, as well as on the impact of departing from their assumptions. Moreover, we investigated, for the first time in IR, the concept of meaningfulness, i.e. the invariance of the experimental statements and inferences you draw, and proposed it as a way to ensure more valid and generalizable results. Moffat's comments (i) build on misconceptions about the representational theory of measurement, such as what an interval scale actually is and what axioms it has to comply with; and (ii) totally miss the central concept of meaningfulness. Therefore, we reply to Moffat's comments by properly framing them in the representational theory of measurement and in the concept of meaningfulness. All in all, we can only reiterate what we have said several times: the goal of this research line is to theoretically ground our evaluation methodology (and IR is a field where it is extremely challenging to make any theoretical advances) in order to aim for more robust and generalizable inferences, something we currently lack in the field. There may be other and better ways to achieve this objective, and such proposals could emerge from an open discussion in the field and from the work of others. On the other hand, reducing everything to a dispute over what is (or pretends to be) an interval scale, or over whether all or none of the evaluation measures are interval scales, may be more of a barrier than a help in progressing towards this goal.


