Recommendations on test datasets for evaluating AI solutions in pathology

by André Homeyer, et al.

Artificial intelligence (AI) solutions that automatically extract information from digital histology images have shown great promise for improving pathological diagnosis. Prior to routine use, it is important to evaluate their predictive performance and obtain regulatory approval. This assessment requires appropriate test datasets. However, compiling such datasets is challenging and specific recommendations are missing. A committee of various stakeholders, including commercial AI developers, pathologists, and researchers, discussed key aspects and conducted extensive literature reviews on test datasets in pathology. Here, we summarize the results and derive general recommendations for the collection of test datasets. We address several questions: Which and how many images are needed? How to deal with low-prevalence subsets? How can potential bias be detected? How should datasets be reported? What are the regulatory requirements in different countries? The recommendations are intended to help AI developers demonstrate the utility of their products and to help regulatory agencies and end users verify reported performance measures. Further research is needed to formulate criteria for sufficiently representative test datasets so that AI solutions can operate with less user intervention and better support diagnostic workflows in the future.

