Towards Reducing Manual Workload in Technology-Assisted Reviews: Estimating Ranking Performance
Conducting a systematic review (SR) comprises multiple tasks: (i) collect documents (studies) that are likely to be relevant from digital libraries (e.g., PubMed), (ii) manually read and label the documents as relevant or irrelevant, (iii) extract information from the relevant studies, and (iv) analyze and synthesize the information to derive the conclusion of the SR. When researchers label studies, they can screen a ranked list of documents in which relevant documents are ranked higher than irrelevant ones. This practice, known as screening prioritization (i.e., a document ranking approach), speeds up the process of conducting an SR because the documents labelled as relevant can move on to the next tasks earlier. However, the approach is limited in reducing the manual workload because the total number of documents to screen remains the same. Towards reducing the manual workload in the screening process, we investigate the quality of document rankings for SRs. Knowing this quality can signal to researchers whereabouts in the ranking the relevant studies are located and let them decide where to stop screening. After extensive analysis of SR document rankings from different ranking models, we hypothesize 'topic broadness' as a factor that affects the ranking quality of an SR. Finally, we propose a measure that estimates topic broadness and demonstrate that it is a simple yet effective method for predicting the quality of document rankings for SRs.