Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction

01/06/2023
by   Jun Gao, et al.

We present Mask-then-Fill, a flexible and effective data augmentation framework for event extraction. Our approach allows more flexible manipulation of text and can therefore generate more diverse data while keeping the original event structure as intact as possible. Specifically, it first randomly masks out an adjunct sentence fragment and then infills a variable-length text span using a fine-tuned infilling model. Its main advantage is that it can replace a fragment of arbitrary length with another fragment of variable length, whereas existing methods can only replace a single word or a fixed-length fragment. On trigger and argument extraction tasks, the proposed framework is more effective than baseline methods, and it shows particularly strong results in the low-resource setting. Further analysis shows that it achieves a good balance between diversity and distributional similarity.
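The mask-then-infill step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that an "adjunct" fragment is any token range outside the annotated trigger and argument spans, and it replaces the fine-tuned infilling model with a hypothetical stub, toy_infill, that always returns the same phrase. The names MASK, adjunct_spans, and mask_then_fill are likewise made up for illustration.

```python
import random
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)
MASK = "<mask>"         # hypothetical placeholder token


def adjunct_spans(n_tokens: int, protected: List[Span]) -> List[Span]:
    """Return the maximal token ranges that overlap no protected span."""
    spans, cursor = [], 0
    for start, end in sorted(protected):
        if cursor < start:
            spans.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < n_tokens:
        spans.append((cursor, n_tokens))
    return spans


def mask_then_fill(
    tokens: List[str],
    protected: List[Span],
    infill: Callable[[List[str]], List[str]],
    rng: random.Random,
) -> List[str]:
    """Mask one randomly chosen adjunct fragment and splice in the infill."""
    candidates = adjunct_spans(len(tokens), protected)
    if not candidates:
        return list(tokens)  # nothing can be masked safely
    start, end = rng.choice(candidates)
    masked = tokens[:start] + [MASK] + tokens[end:]
    filler = infill(masked)  # may be shorter or longer than the masked span
    return tokens[:start] + filler + tokens[end:]


def toy_infill(masked_tokens: List[str]) -> List[str]:
    """Stand-in for a fine-tuned infilling model (assumption, not a real model)."""
    return ["late", "on", "Tuesday", "night"]


tokens = "The rebels attacked the convoy yesterday".split()
# Assumed annotations: argument "The rebels", trigger "attacked", argument "the convoy".
protected = [(0, 2), (2, 3), (3, 5)]
augmented = mask_then_fill(tokens, protected, toy_infill, random.Random(0))
print(" ".join(augmented))
```

Here the one-token fragment "yesterday" is replaced by a four-token fragment, while the trigger and arguments are left untouched. In a real pipeline, the annotation offsets of any spans that follow the replaced fragment would need to be shifted by the length difference.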


