Adapting Coreference Resolution for Processing Violent Death Narratives

04/30/2021
by Ankith Uppunda, et al.

Coreference resolution is an important component in analyzing narrative text from administrative data (e.g., clinical or police sources). However, existing coreference models trained on general language corpora transfer poorly because of domain gaps, especially when they are applied to gender-inclusive data involving lesbian, gay, bisexual, and transgender (LGBT) individuals. In this paper, we analyzed the challenges of coreference resolution in an exemplary form of administrative text written in English: violent death narratives from the National Violent Death Reporting System of the US Centers for Disease Control and Prevention (CDC). We developed a set of data augmentation rules to improve model performance using a probabilistic data programming framework. Experiments on narratives from an administrative database, as well as on existing gender-inclusive coreference datasets, demonstrate the effectiveness of data augmentation in training coreference models that can better handle text data about LGBT individuals.

research
04/05/2023

Performance of Data Augmentation Methods for Brazilian Portuguese Text Classification

Improving machine learning performance while increasing model generaliza...
research
06/02/2023

Exploring semantic information in disease: Simple Data Augmentation Techniques for Chinese Disease Normalization

The disease is a core concept in the medical field, and the task of norm...
research
03/26/2023

Analyzing Effects of Mixed Sample Data Augmentation on Model Interpretability

Data augmentation strategies are actively used when training deep neural...
research
06/12/2023

Gender-Inclusive Grammatical Error Correction through Augmentation

In this paper we show that GEC systems display gender bias related to th...
research
10/11/2020

PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation

De-identification is the task of identifying protected health informatio...
research
03/16/2023

Investigating Failures to Generalize for Coreference Resolution Models

Coreference resolution models are often evaluated on multiple datasets. ...
research
01/02/2021

SDA: Improving Text Generation with Self Data Augmentation

Data augmentation has been widely used to improve deep neural networks i...
