Highdicom: A Python library for standardized encoding of image annotations and machine learning model outputs in pathology and radiology

06/14/2021
by   Christopher P. Bridge, et al.

Machine learning (ML) is revolutionizing image-based diagnostics in pathology and radiology. ML models have shown promising results in research settings, but their lack of interoperability has been a major barrier to clinical integration and evaluation. The DICOM standard specifies Information Object Definitions and Services for the representation and communication of digital images and related information, including image-derived annotations and analysis results. However, the complexity of the standard is an obstacle to its adoption in the ML community and creates a need for software libraries and tools that simplify working with data sets in DICOM format. Here we present the highdicom library, which provides a high-level application programming interface for the Python programming language that abstracts low-level details of the standard and enables encoding and decoding of image-derived information in DICOM format in a few lines of Python code. The highdicom library ties into the extensive Python ecosystem for image processing and machine learning. At the same time, by simplifying the creation and parsing of DICOM-compliant files, highdicom achieves interoperability with the medical imaging systems that hold the data used to train and run ML models, and that ultimately communicate and store model outputs for clinical use. We demonstrate through experiments with slide microscopy and computed tomography imaging that, by bridging these two ecosystems, highdicom enables developers to train and evaluate state-of-the-art ML models in pathology and radiology while remaining compliant with the DICOM standard and interoperable with clinical systems at all stages. To promote standardization of ML research and streamline the ML model development and deployment process, we have made the library available as free and open-source software.

research
01/18/2022

Studying Popular Open Source Machine Learning Libraries and Their Cross-Ecosystem Bindings

Open source machine learning (ML) libraries allow developers to integrat...
research
06/06/2017

ChemKED: a human- and machine-readable data standard for chemical kinetics experiments

Fundamental experimental measurements of quantities such as ignition del...
research
05/24/2022

Pynblint: a Static Analyzer for Python Jupyter Notebooks

Jupyter Notebook is the tool of choice of many data scientists in the ea...
research
11/11/2022

Anonymization of Whole Slide Images in Histopathology for Research and Education

Objective: The exchange of health-related data is subject to regional la...
research
07/23/2019

SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python

SciPy is an open source scientific computing library for the Python prog...
research
07/14/2022

problexity – an open-source Python library for binary classification problem complexity assessment

The classification problem's complexity assessment is an essential eleme...
research
03/15/2023

Building an Effective Email Spam Classification Model with spaCy

Today, people use email services such as Gmail, Outlook, AOL Mail, etc. ...
