Inspecting state of the art performance and NLP metrics in image-based medical report generation

11/18/2020
by Pablo Pino, et al.

Several deep learning architectures have been proposed in recent years to generate a written report from an imaging exam given as input. Most works evaluate the generated reports using standard Natural Language Processing (NLP) metrics (e.g., BLEU, ROUGE) and report significant progress. In this article, we put this progress in perspective by comparing state-of-the-art (SOTA) models against weak baselines. We show that simple and even naive approaches yield near-SOTA performance on most traditional NLP metrics. We conclude that evaluation methods for this task should be studied further so that they correctly measure clinical accuracy, ideally with the involvement of physicians.
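To make the point about naive approaches concrete, the following is a minimal sketch of how a constant baseline (one fixed report returned for every exam) can still score well on n-gram overlap. It is not the authors' code or evaluation setup: the function `unigram_precision` is a hypothetical helper that computes only clipped unigram precision, the 1-gram ingredient of BLEU without the brevity penalty, and the reference sentences are invented toy examples rather than data from the paper.

```python
from collections import Counter


def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision: the 1-gram building block of BLEU.

    Hypothetical helper for illustration only (no brevity penalty,
    no higher-order n-grams, whitespace tokenization).
    """
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(cand_tokens)
    # Each candidate token is credited at most as often as it occurs
    # in the reference ("clipping").
    overlap = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    return overlap / len(cand_tokens)


# Invented toy references (assumption: NOT taken from any dataset).
references = [
    "the heart is normal in size . the lungs are clear .",
    "the lungs are clear . no pleural effusion .",
    "the heart is enlarged . the lungs are clear .",
]

# Naive baseline: always emit the same report, ignoring the image.
constant_report = "the heart is normal in size . the lungs are clear ."

scores = [unigram_precision(constant_report, ref) for ref in references]
for ref, score in zip(references, scores):
    print(f"{score:.2f}  {ref}")
```

In this toy setting the constant report scores 0.75 against the third reference even though it says the heart is normal while the reference says it is enlarged, which is the kind of gap between word overlap and clinical accuracy that the abstract describes.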


research
10/20/2020

A Survey on Deep Learning and Explainability for Automatic Image-based Medical Report Generation

Every year physicians face an increasing demand of image-based diagnosis...
research
08/12/2016

Measuring the State of the Art of Automated Pathway Curation Using Graph Algorithms - A Case Study of the mTOR Pathway

This paper evaluates the difference between human pathway curation and c...
research
04/25/2022

A global analysis of metrics used for measuring performance in natural language processing

Measuring the performance of natural language processing models is chall...
research
05/06/2019

Caveats in Generating Medical Imaging Labels from Radiology Reports

Acquiring high-quality annotations in medical imaging is usually a costl...
research
10/20/2020

Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation

Neural image-to-text radiology report generation systems offer the poten...
research
10/27/2020

On the diminishing return of labeling clinical reports

Ample evidence suggests that better machine learning models may be stead...
research
08/02/2021

Underreporting of errors in NLG output, and what to do about it

We observe a severe under-reporting of the different kinds of errors tha...
