Sources of performance variability in deep learning-based polyp detection

by Thuy Nuong Tran, et al.

Validation metrics are a key prerequisite for the reliable tracking of scientific progress and for deciding on the potential clinical translation of methods. While recent initiatives aim to develop comprehensive theoretical frameworks for understanding metric-related pitfalls in image analysis problems, there is a lack of experimental evidence on the concrete effects of common and rare pitfalls on specific applications. We address this gap in the literature in the context of colon cancer screening. Our contribution is twofold. Firstly, we present the winning solution of the Endoscopy computer vision challenge (EndoCV) on colon cancer detection, conducted in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2022. Secondly, we demonstrate the sensitivity of commonly used metrics to a range of hyperparameters as well as the consequences of poor metric choices. Based on comprehensive validation studies performed with patient data from six clinical centers, we found all commonly applied object detection metrics to be subject to high inter-center variability. Furthermore, our results clearly demonstrate that adopting the standard hyperparameters used in the computer vision community does not generally lead to the clinically most plausible results. Finally, we present localization criteria that correspond well to clinical relevance. Our work could be a first step towards reconsidering common validation strategies in automatic colon cancer screening applications.
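To make the notion of metric sensitivity to hyperparameters concrete, the following minimal sketch shows how a single localization threshold can flip a detection from a true positive to a false positive. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not name the hyperparameters studied, so the choice of an Intersection over Union (IoU) threshold, the greedy matching rule, and the names `iou` and `precision_recall` are all hypothetical, as are the box coordinates.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds: List[Box], refs: List[Box], thr: float) -> Tuple[float, float]:
    """Greedy one-to-one matching of predictions to reference boxes.

    Predictions are assumed to be sorted by descending confidence.
    A prediction counts as a true positive if its best unmatched
    reference box overlaps it with IoU >= thr.
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, r in enumerate(refs):
            if j in matched:
                continue
            v = iou(p, r)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(refs) if refs else 0.0
    return precision, recall


# Hypothetical example: one reference polyp box and one shifted prediction.
refs = [(10.0, 10.0, 50.0, 50.0)]
preds = [(20.0, 20.0, 60.0, 60.0)]

for thr in (0.3, 0.5):
    print(thr, precision_recall(preds, refs, thr))
```

In this toy case the two boxes overlap with an IoU of about 0.39, so the same prediction is a hit at a threshold of 0.3 and a miss at 0.5. Precision and recall move from 1.0 to 0.0 without any change in the model output, which is the kind of dependence on evaluation settings that the abstract describes.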


