What do You Mean by Relation Extraction? A Survey on Datasets and Study on Scientific Relation Classification

04/28/2022
by Elisa Bassignana, et al.

Over the last five years, research on Relation Extraction (RE) has witnessed extensive progress, with many new datasets released. At the same time, setup clarity has decreased, making reliable empirical evaluation more difficult (Taillé et al., 2020). In this paper, we provide a comprehensive survey of RE datasets, and revisit the task definition and its adoption by the community. We find that cross-dataset and cross-domain setups are particularly lacking. We present an empirical study on scientific Relation Classification across two datasets. Despite large data overlap, our analysis reveals substantial discrepancies in annotation. These annotation discrepancies strongly impact Relation Classification performance, explaining large drops in cross-dataset evaluations. Variation within further sub-domains exists but affects Relation Classification only to a limited degree. Overall, our study calls for more rigour in reporting RE setups and for evaluation across multiple test sets.

research
10/17/2022

CrossRE: A Cross-Domain Dataset for Relation Extraction

Relation Extraction (RE) has attracted increasing attention, but current...
research
02/18/2021

WebRED: Effective Pretraining And Finetuning For Relation Extraction On The Web

Relation extraction is used to populate knowledge bases that are importa...
research
09/22/2020

Let's Stop Incorrect Comparisons in End-to-end Relation Extraction!

Despite efforts to distinguish three different evaluation setups (Bekoul...
research
10/19/2022

CEntRE: A paragraph-level Chinese dataset for Relation Extraction among Enterprises

Enterprise relation extraction aims to detect pairs of enterprise entiti...
research
09/05/2021

Semi-Automated Labeling of Requirement Datasets for Relation Extraction

Creating datasets manually by human annotators is a laborious task that ...
research
10/16/2019

FewRel 2.0: Towards More Challenging Few-Shot Relation Classification

We present FewRel 2.0, a more challenging task to investigate two aspect...
research
04/16/2021

Re-TACRED: Addressing Shortcomings of the TACRED Dataset

TACRED is one of the largest and most widely used sentence-level relatio...
