Exploring semantic information in disease: Simple Data Augmentation Techniques for Chinese Disease Normalization

06/02/2023
by Wenqian Cui, et al.

Disease is a core concept in the medical field, and the task of normalizing disease names is the basis of all disease-related tasks. However, because disease names are multi-axis and multi-grain, general text data augmentation techniques often inject incorrect information and harm performance. To address this problem, we propose a set of data augmentation techniques that work together as an augmented training task for disease normalization. Our data augmentation methods are based on both a clinical disease corpus and a standard disease corpus derived from ICD-10 coding. Extensive experiments show the effectiveness of the proposed methods: they achieve up to a 3% performance gain over non-augmented counterparts, and they work even better on smaller datasets.
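To make the failure mode concrete, the sketch below shows how a generic augmentation step (random token deletion) can silently change which standard entry a disease name matches. It is an illustration only, not the augmentation method proposed in the paper. The toy vocabulary, the labels (`T2DM`, `T1DM`, `DM_UNSPECIFIED`), and the function names are all hypothetical, and English tokens stand in for Chinese disease names.

```python
from typing import Dict, List, Tuple

Tokens = Tuple[str, ...]

# Hypothetical toy vocabulary: tokenized standard names mapped to made-up labels.
# These are NOT real ICD-10 codes; they only illustrate the idea.
STANDARD_NAMES: Dict[Tokens, str] = {
    ("type 2", "diabetes", "mellitus"): "T2DM",
    ("type 1", "diabetes", "mellitus"): "T1DM",
    ("diabetes", "mellitus"): "DM_UNSPECIFIED",
}


def single_token_deletions(tokens: Tokens) -> List[Tokens]:
    """Return every variant of `tokens` with exactly one token removed."""
    return [tokens[:i] + tokens[i + 1:] for i in range(len(tokens))]


def label_changing_deletions(
    tokens: Tokens, vocab: Dict[Tokens, str]
) -> List[Tuple[Tokens, str]]:
    """Return deletions that exactly match a standard name with a different label."""
    original_label = vocab[tokens]
    changed = []
    for variant in single_token_deletions(tokens):
        label = vocab.get(variant)
        if label is not None and label != original_label:
            changed.append((variant, label))
    return changed


if __name__ == "__main__":
    name = ("type 2", "diabetes", "mellitus")
    for variant, label in label_changing_deletions(name, STANDARD_NAMES):
        print(" ".join(variant), "->", label)
```

In this toy setup, deleting the type token yields a name that matches a coarser entry, so an augmented sample that kept the original label would carry incorrect supervision. This is one way to read the abstract's point that multi-axis, multi-grain names make generic augmentation risky.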


