Is artificial data useful for biomedical Natural Language Processing

07/01/2019
by   Zixu Wang, et al.

A major obstacle to the development of Natural Language Processing (NLP) methods in the biomedical domain is data accessibility. This problem can be addressed by generating medical data artificially. Most previous studies have focused on the generation of short clinical text, and evaluation of the data utility has been limited. We propose a generic methodology to guide the generation of clinical text with key phrases. We use the artificial data as additional training data in two key biomedical NLP tasks: text classification and temporal relation extraction. We show that artificially generated training data used in conjunction with real training data can lead to performance boosts for data-greedy neural network algorithms. We also demonstrate the usefulness of the generated data for NLP setups where it fully replaces real training data.
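The two ways the abstract describes using artificial data (alongside real training data, or as a full replacement for it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `build_training_set`, the setup names, and the toy examples are hypothetical and are not taken from the paper, which does not specify its data pipeline here.

```python
from typing import List, Tuple

# A training example is a (text, label) pair. This representation is an
# assumption for illustration, not the paper's actual data format.
Example = Tuple[str, str]


def build_training_set(
    real: List[Example],
    artificial: List[Example],
    setup: str,
) -> List[Example]:
    """Assemble training data for one of three hypothetical setups.

    "real_only"       -> baseline: real data only
    "augmented"       -> real data plus artificial data
    "artificial_only" -> artificial data fully replaces real data
    """
    if setup == "real_only":
        return list(real)
    if setup == "augmented":
        return list(real) + list(artificial)
    if setup == "artificial_only":
        return list(artificial)
    raise ValueError(f"unknown setup: {setup!r}")


# Toy, made-up examples standing in for real and generated clinical text.
real_data = [
    ("patient admitted with chest pain", "cardiology"),
    ("fracture of the left wrist", "orthopedics"),
]
artificial_data = [
    ("patient reports chest tightness on exertion", "cardiology"),
]

augmented = build_training_set(real_data, artificial_data, "augmented")
replaced = build_training_set(real_data, artificial_data, "artificial_only")
```

Comparing a model trained on each of these sets against the same held-out real test set is one way to measure whether the artificial data helps; the model and metric are left open here because the abstract does not fix them.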

research
09/10/2021

How May I Help You? Using Neural Text Simplification to Improve Downstream NLP Tasks

The general goal of text simplification (TS) is to reduce text complexit...
research
05/10/2017

A Biomedical Information Extraction Primer for NLP Researchers

Biomedical Information Extraction is an exciting field at the crossroads...
research
04/21/2022

Few-shot learning for medical text: A systematic review

Objective: Few-shot learning (FSL) methods require small numbers of labe...
research
08/10/2023

LASIGE and UNICAGE solution to the NASA LitCoin NLP Competition

Biomedical Natural Language Processing (NLP) tends to become cumbersome ...
research
02/03/2021

Memorization vs. Generalization: Quantifying Data Leakage in NLP Performance Evaluation

Public datasets are often used to evaluate the efficacy and generalizabi...
research
10/27/2020

On the diminishing return of labeling clinical reports

Ample evidence suggests that better machine learning models may be stead...
research
04/05/2022

Design considerations for a hierarchical semantic compositional framework for medical natural language understanding

Medical natural language processing (NLP) systems are a key enabling tec...
