A new direction to promote the implementation of artificial intelligence in natural clinical settings

Artificial intelligence (AI) researchers claim to have made great "achievements" in clinical domains. Clinicians, however, point out that these so-called "achievements" cannot be implemented in natural clinical settings. The root cause of this huge gap is that AI system developers without a medical background overlook many essential features of natural clinical tasks. In this paper, we propose the clinical benchmark suite as a novel and promising direction: it captures the essential features of real-world clinical tasks and is therefore suited to guiding the development of AI systems and promoting the implementation of AI in real-world clinical practice.






