A scoping review on multimodal deep learning in biomedical images and texts

07/14/2023
by Zhaoyi Sun, et al.

Computer-assisted diagnostic and prognostic systems of the future should be capable of processing multimodal data simultaneously. Multimodal deep learning (MDL), which integrates multiple sources of data such as images and text, has the potential to revolutionize the analysis and interpretation of biomedical data. However, it has only recently attracted researchers' attention. There is therefore a critical need to systematically review this topic, identify the limitations of current work, and explore future directions. In this scoping review, we aim to provide a comprehensive overview of the current state of the field and to identify key concepts, types of studies, and research gaps. We focus on the joint learning of biomedical images and texts, mainly because these two are the most commonly available data types in MDL research. We review current uses of MDL on five tasks: (1) report generation, (2) visual question answering, (3) cross-modal retrieval, (4) computer-aided diagnosis, and (5) semantic segmentation. Our results highlight the diverse applications and potential of MDL and suggest directions for future research. We hope our review will facilitate collaboration between the natural language processing (NLP) and medical imaging communities and support the development of the next generation of decision-making and computer-assisted diagnostic systems.
