Integrating diverse extraction pathways using iterative predictions for Multilingual Open Information Extraction

10/15/2021
by   Bhushan Kotnis, et al.
0

In this paper we investigate a simple hypothesis for the Open Information Extraction (OpenIE) task, that it may be easier to extract some elements of an triple if the extraction is conditioned on prior extractions which may be easier to extract. We successfully exploit this and propose a neural multilingual OpenIE system that iteratively extracts triples by conditioning extractions on different elements of the triple leading to a rich set of extractions. The iterative nature of MiLIE also allows for seamlessly integrating rule based extraction systems with a neural end-to-end system leading to improved performance. MiLIE outperforms SOTA systems on multiple languages ranging from Chinese to Galician thanks to it's ability of combining multiple extraction pathways. Our analysis confirms that it is indeed true that certain elements of an extraction are easier to extract than others. Finally, we introduce OpenIE evaluation datasets for two low resource languages namely Japanese and Galician.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/18/2022

CaMEL: Case Marker Extraction without Labels

We introduce CaMEL (Case Marker Extraction without Labels), a novel and ...
research
02/09/2023

Massively Multilingual Language Models for Cross Lingual Fact Extraction from Low Resource Indian Languages

Massive knowledge graphs like Wikidata attempt to capture world knowledg...
research
06/30/2022

Domain Adaptive Pretraining for Multilingual Acronym Extraction

This paper presents our findings from participating in the multilingual ...
research
05/17/2020

IMoJIE: Iterative Memory-Based Joint Open Information Extraction

While traditional systems for Open Information Extraction were statistic...
research
05/20/2022

Multilingual Normalization of Temporal Expressions with Masked Language Models

The detection and normalization of temporal expressions is an important ...
research
07/04/2017

Document Spanners for Extracting Incomplete Information: Expressiveness and Complexity

Rule-based information extraction has lately received a fair amount of a...
research
10/19/2021

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

End-to-end TTS suffers from high data requirements as it is difficult fo...

Please sign up or login with your details

Forgot password? Click here to reset