The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation

06/12/2023
by Hao Peng, et al.

Event extraction (EE) is a crucial task that aims to extract events from text and comprises two subtasks: event detection (ED) and event argument extraction (EAE). In this paper, we check the reliability of EE evaluations and identify three major pitfalls: (1) Data preprocessing discrepancies make evaluation results on the same dataset not directly comparable, yet preprocessing details are rarely noted or specified in papers. (2) Output space discrepancies between model paradigms leave EE models of different paradigms without common grounds for comparison and also lead to unclear mappings between predictions and annotations. (3) Many EAE-only works omit pipeline evaluation, which makes them hard to compare directly with EE works and may not reflect model performance in real-world pipeline scenarios. We demonstrate the significant influence of these pitfalls through comprehensive meta-analyses of recent papers and empirical experiments. To avoid these pitfalls, we suggest a series of remedies, including specifying data preprocessing, standardizing outputs, and providing pipeline evaluation results. To help implement these remedies, we develop a consistent evaluation framework, OMNIEVENT, which can be obtained from https://github.com/THU-KEG/OmniEvent.
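
To make the third pitfall concrete, the following is a minimal sketch of how the same EAE predictions could score differently with gold triggers and with predicted triggers. It is not the OmniEvent API: the tuple layouts, the helper names `micro_f1` and `collect_arguments`, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical layouts, assumed for this sketch only:
# trigger  = (sentence_id, start, end, event_type)
# argument = (start, end, role), always attached to one trigger
Trigger = Tuple[int, int, int, str]
Argument = Tuple[int, int, str]


def micro_f1(predicted: Set[tuple], gold: Set[tuple]) -> float:
    """Micro-averaged F1 over exact-match tuples."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def collect_arguments(
    triggers: Set[Trigger], eae_output: Dict[Trigger, Set[Argument]]
) -> Set[tuple]:
    """Gather the arguments the EAE model emits for the given triggers."""
    return {
        trigger + argument
        for trigger in triggers
        for argument in eae_output.get(trigger, set())
    }


# Toy data: the ED model finds one gold trigger, misses one, and adds a spurious one.
gold_triggers = {(0, 3, 4, "Attack"), (1, 5, 6, "Transport")}
pred_triggers = {(0, 3, 4, "Attack"), (1, 7, 8, "Meet")}

# What a hypothetical EAE model would output for each possible trigger.
eae_output = {
    (0, 3, 4, "Attack"): {(0, 1, "Attacker"), (6, 7, "Target")},
    (1, 5, 6, "Transport"): {(2, 3, "Artifact")},
    (1, 7, 8, "Meet"): {(2, 3, "Entity")},
}

gold_arguments = {
    (0, 3, 4, "Attack", 0, 1, "Attacker"),
    (0, 3, 4, "Attack", 6, 7, "Target"),
    (1, 5, 6, "Transport", 2, 3, "Artifact"),
}

# Gold-trigger setting: the EAE model is only ever shown correct triggers.
gold_trigger_f1 = micro_f1(collect_arguments(gold_triggers, eae_output), gold_arguments)

# Pipeline setting: the EAE model is shown whatever the ED model predicted.
pipeline_f1 = micro_f1(collect_arguments(pred_triggers, eae_output), gold_arguments)

print(f"gold-trigger EAE F1: {gold_trigger_f1:.3f}")
print(f"pipeline EAE F1:     {pipeline_f1:.3f}")
```

In this toy case the gold-trigger score is perfect, while the pipeline score drops because the missed trigger costs recall and the spurious trigger costs precision. Real evaluations would also have to fix the matching criteria (for example span offsets versus strings), which is where the preprocessing and output-space pitfalls enter.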

