Utilizing coarse-grained data in low-data settings for event extraction

05/11/2022
by Osman Mutlu, et al.

Annotating text data for event information extraction systems is hard, expensive, and error-prone. We investigate the feasibility of integrating coarse-grained data (document or sentence labels), which is far cheaper to obtain, as an alternative to annotating more documents. We use a multi-task model with two auxiliary tasks, document-level and sentence-level binary classification, in addition to the main task of token classification. We run a series of experiments that integrate this data under varying data regimes. Results show that while adding extra coarse-grained data offers greater improvement and robustness, a gain is still possible by adding only negative documents that contain no information on any event.
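
The multi-task setup described above can be illustrated with a minimal sketch of a combined loss. This is not the authors' implementation: the function names, the `aux_weight` parameter, and the simple weighted sum are all assumptions made for illustration. The sketch only shows how a document that carries coarse-grained labels alone (for example, a negative document with nothing but a document-level label) can still contribute a training signal alongside fully annotated documents.

```python
import math
from typing import Optional, Sequence

EPS = 1e-12  # guards against log(0)


def binary_cross_entropy(prob: float, label: int) -> float:
    """Loss for one binary decision (document level or sentence level)."""
    prob = min(max(prob, EPS), 1.0 - EPS)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def token_cross_entropy(tag_probs: Sequence[Sequence[float]],
                        tags: Sequence[int]) -> float:
    """Mean cross-entropy over tokens; tag_probs[i][k] is P(tag k | token i)."""
    losses = [-math.log(max(p[t], EPS)) for p, t in zip(tag_probs, tags)]
    return sum(losses) / len(losses)


def multitask_loss(doc_prob: float,
                   doc_label: int,
                   sent_probs: Optional[Sequence[float]] = None,
                   sent_labels: Optional[Sequence[int]] = None,
                   tag_probs: Optional[Sequence[Sequence[float]]] = None,
                   tags: Optional[Sequence[int]] = None,
                   aux_weight: float = 1.0) -> float:
    """Hypothetical combined loss: token loss plus weighted auxiliary losses.

    Tasks whose labels are missing (None) are skipped, so a document with
    only coarse-grained labels still yields a loss.
    """
    # Auxiliary task 1: document-level binary classification.
    loss = aux_weight * binary_cross_entropy(doc_prob, doc_label)
    # Auxiliary task 2: sentence-level binary classification, if labeled.
    if sent_labels is not None:
        sent_loss = sum(binary_cross_entropy(p, y)
                        for p, y in zip(sent_probs, sent_labels)) / len(sent_labels)
        loss += aux_weight * sent_loss
    # Main task: token classification, only for fully annotated documents.
    if tags is not None:
        loss += token_cross_entropy(tag_probs, tags)
    return loss


# A negative document: only the document-level label is available.
negative_doc_loss = multitask_loss(doc_prob=0.1, doc_label=0)

# A fully annotated document: all three tasks contribute.
full_loss = multitask_loss(
    doc_prob=0.9, doc_label=1,
    sent_probs=[0.8, 0.2], sent_labels=[1, 0],
    tag_probs=[[0.7, 0.3], [0.4, 0.6]], tags=[0, 1],
)
```

In a real model the probabilities would come from three classification heads over a shared encoder; the sketch takes them as plain numbers to keep the loss combination visible.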
