RED^FM: a Filtered and Multilingual Relation Extraction Dataset

06/16/2023
by   Pere-Lluís Huguet Cabot, et al.

Relation Extraction (RE) is a task that identifies relationships between entities in a text, enabling the acquisition of relational facts and bridging the gap between natural language and structured knowledge. However, current RE models often rely on small datasets with low coverage of relation types, particularly when working with languages other than English. In this paper, we address this issue and provide two new resources that enable the training and evaluation of multilingual RE systems. First, we present SRED^FM, an automatically annotated dataset covering 18 languages, 400 relation types, and 13 entity types, totaling more than 40 million triplet instances. Second, we propose RED^FM, a smaller, human-revised dataset for seven languages that allows for the evaluation of multilingual RE systems. To demonstrate the utility of these novel datasets, we experiment with the first end-to-end multilingual RE model, mREBEL, that extracts triplets, including entity types, in multiple languages. We release our resources and model checkpoints at https://www.github.com/babelscape/rebel
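
To make the notion of a typed triplet concrete, the following is a minimal Python sketch of what one extracted fact might look like. Everything in it is an illustrative assumption rather than the released data format or the mREBEL interface: the `Entity` and `Triplet` classes, the `LOC` type label, the relation name, and the `extract_by_lookup` helper (a toy string-matching stand-in for a real RE model) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    surface: str  # text span as it appears in the sentence
    type: str     # coarse entity type label (hypothetical label set)


@dataclass(frozen=True)
class Triplet:
    subject: Entity
    relation: str  # relation type name (hypothetical)
    object: Entity


def extract_by_lookup(text, facts):
    """Toy stand-in for an RE model: keep the known facts whose
    subject and object surface forms both occur in the text."""
    return [
        t for t in facts
        if t.subject.surface in text and t.object.surface in text
    ]


# Hypothetical knowledge-base facts used only for this illustration.
facts = [
    Triplet(Entity("Rome", "LOC"), "capital of", Entity("Italy", "LOC")),
    Triplet(Entity("Paris", "LOC"), "capital of", Entity("France", "LOC")),
]

found = extract_by_lookup("Rome is the capital of Italy.", facts)
for t in found:
    print(f"({t.subject.surface}:{t.subject.type}, {t.relation}, "
          f"{t.object.surface}:{t.object.type})")
```

A real system would predict the subject, relation, and object from the text itself; the lookup here only serves to show the shape of the output, a subject and object each carrying an entity type, linked by a relation type.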


research
04/17/2021

DiS-ReX: A Multilingual Dataset for Distantly Supervised Relation Extraction

Distant supervision (DS) is a well established technique for creating la...
research
01/11/2023

Multilingual Entity and Relation Extraction from Unified to Language-specific Training

Entity and relation extraction is a key task in information extraction, ...
research
08/14/2019

X-WikiRE: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension

Although the vast majority of knowledge bases KBs are heavily biased tow...
research
05/08/2023

MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset

Relation extraction (RE) is a fundamental task in information extraction...
research
01/28/2021

LOME: Large Ontology Multilingual Extraction

We present LOME, a system for performing multilingual information extrac...
research
09/23/2017

Language Independent Acquisition of Abbreviations

This paper addresses automatic extraction of abbreviations (encompassing...
