Outcome-based Evaluation of Systematic Review Automation

06/30/2023
by Wojciech Kusa, et al.

Current methods of evaluating search strategies and automated citation screening for systematic literature reviews typically rely on counting the number of relevant and non-relevant publications. This established practice, however, does not accurately reflect the reality of conducting a systematic review, because not all included publications have the same influence on the final outcome of the review. More specifically, wrongly excluding or including an important publication might significantly change the overall review outcome, whereas doing the same with a less influential study may have only a limited impact. In terms of evaluation measures, however, all inclusion and exclusion decisions are treated equally; failing to retrieve a publication with little to no impact on the review outcome therefore leads to the same decrease in recall as failing to retrieve a crucial one. We propose a new evaluation framework that takes into account the impact of each reported study on the overall systematic review outcome. We demonstrate the framework by extracting review meta-analysis data and estimating outcome effects using predictions from ranking runs on systematic reviews of interventions from the CLEF TAR 2019 shared task. We further measure how close the outcomes obtained from these rankings are to the outcomes of the original review. We evaluate 74 runs using the proposed framework and compare the results with those obtained using standard IR measures. We find that accounting for the difference in review outcomes leads to a different assessment of system quality than traditional evaluation measures do. Our analysis provides new insights into the evaluation of retrieval results in the context of systematic review automation, emphasising the importance of assessing the usefulness of each document beyond binary relevance.
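
The contrast between recall and an outcome-based view can be illustrated with a minimal sketch. This is not the authors' framework: it assumes a fixed-effect, inverse-variance pooled estimate as the review outcome, and the function names (`pooled_effect`, `recall`, `outcome_error`) and the three studies with their effect sizes and variances are hypothetical values chosen only for illustration.

```python
def pooled_effect(studies):
    """Fixed-effect inverse-variance pooled estimate.

    `studies` is a list of (effect, variance) pairs; each study is
    weighted by 1 / variance, so precise studies dominate the result.
    """
    weights = [1.0 / variance for _, variance in studies]
    weighted_sum = sum(w * effect for w, (effect, _) in zip(weights, studies))
    return weighted_sum / sum(weights)


def recall(retrieved_ids, included):
    """Fraction of the review's included studies that were retrieved."""
    return len(set(retrieved_ids) & set(included)) / len(included)


def outcome_error(retrieved_ids, included):
    """Absolute difference between the pooled effect of the retrieved
    included studies and the pooled effect of all included studies."""
    full = pooled_effect(list(included.values()))
    found = [included[i] for i in retrieved_ids if i in included]
    if not found:
        return float("inf")  # no included study retrieved: no estimate
    return abs(pooled_effect(found) - full)


# Hypothetical toy review: one precise (low-variance) trial, two small ones.
included = {
    "large_trial": (0.50, 0.01),
    "small_a": (0.10, 0.25),
    "small_b": (0.20, 0.25),
}

run_missing_small = ["large_trial", "small_a"]  # misses a small study
run_missing_large = ["small_a", "small_b"]      # misses the large trial

for name, run in [("missing small", run_missing_small),
                  ("missing large", run_missing_large)]:
    print(name, round(recall(run, included), 3),
          round(outcome_error(run, included), 3))
```

Both hypothetical runs retrieve two of the three included studies and so have identical recall, yet the run that misses the large trial ends up much further from the full pooled estimate. Under these assumptions, that gap is the kind of difference a count-based measure cannot see.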

