GENEVA: Pushing the Limit of Generalizability for Event Argument Extraction with 100+ Event Types

05/25/2022
by   Tanmay Parekh, et al.

Numerous events occur worldwide and are documented as raw text in the news, on social media, and on various online platforms. Extracting useful and succinct information about these events is crucial to various downstream applications. Event Argument Extraction (EAE) deals with the task of extracting event-specific information from natural language text. To cater to new events and domains in a realistic low-data setting, there is a growing need for EAE models to be generalizable. Consequently, benchmarking setups are needed to evaluate the generalizability of EAE models. However, most existing benchmarking datasets, such as ACE and ERE, have limited coverage in terms of events and cannot adequately evaluate the generalizability of EAE models. To alleviate this issue, we introduce a new dataset, GENEVA, covering a diverse range of 115 events and 187 argument roles. Using this dataset, we create four benchmarking test suites to assess a model's generalization capability from different perspectives. We benchmark various representative models on these test suites and compare their relative generalizability. Finally, we propose a new model, SCAD, that outperforms the previous models and serves as a strong benchmark for these test suites.


research
11/02/2022

Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset

Event extraction (EE) is crucial to downstream tasks such as new aggrega...
research
07/21/2021

COfEE: A Comprehensive Ontology for Event Extraction from text, with an online annotation tool

Data is published on the web over time in great volumes, but majority of...
research
05/18/2021

The Commodities News Corpus: A Resource for Understanding Commodity News Better

Commodity News contains a wealth of information such as summary of the ...
research
10/25/2022

pmuBAGE: The Benchmarking Assortment of Generated PMU Data for Power System Events

This paper introduces pmuGE (phasor measurement unit Generator of Events...
research
04/03/2022

pmuBAGE: The Benchmarking Assortment of Generated PMU Data for Power System Events – Part I: Overview and Results

We present pmuGE (phasor measurement unit Generator of Events), one of t...
research
04/11/2016

Using Sentence-Level LSTM Language Models for Script Inference

There is a small but growing body of research on statistical scripts, mo...
research
04/06/2022

Improving Zero-Shot Event Extraction via Sentence Simplification

The success of sites such as ACLED and Our World in Data have demonstrat...
