IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models

10/25/2022
by   Chenguang Wang, et al.

We introduce a new open information extraction (OIE) benchmark for pre-trained language models (LMs). Recent studies have demonstrated that pre-trained LMs, such as BERT and GPT, may store linguistic and relational knowledge. In particular, LMs are able to answer “fill-in-the-blank” questions when given a pre-defined relation category. Instead of focusing on pre-defined relations, we create an OIE benchmark that aims to fully examine the open relational information present in pre-trained LMs. We accomplish this by turning pre-trained LMs into zero-shot OIE systems. Surprisingly, pre-trained LMs obtain competitive performance on both standard OIE datasets (CaRB and Re-OIE2016) and two new large-scale factual OIE datasets (TAC KBP-OIE and Wikidata-OIE) that we establish via distant supervision. For instance, the zero-shot pre-trained LMs achieve a higher F1 score than the state-of-the-art supervised OIE methods on our factual OIE datasets, without using any training sets. Our code and datasets are available at https://github.com/cgraywang/IELM.
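To make the F1 comparison concrete, here is a minimal sketch of how precision, recall, and F1 can be computed over extracted (subject, relation, object) triples. It assumes exact matching after lowercasing and whitespace normalization, which is a simplification and not necessarily the scoring used by the benchmark (CaRB, for example, has its own matching scheme). The names `normalize` and `triple_f1` and the example triples are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def normalize(triple: Triple) -> Triple:
    """Lowercase each part and collapse runs of whitespace (assumed normalization)."""
    return tuple(" ".join(part.lower().split()) for part in triple)


def triple_f1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact triple matching.

    Exact matching is an assumption made for this sketch; real OIE scorers
    often use softer, token-level matching.
    """
    pred = {normalize(t) for t in predicted}
    ref = {normalize(t) for t in gold}
    if not pred or not ref:
        return 0.0, 0.0, 0.0
    true_positives = len(pred & ref)
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    if true_positives == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative triples only; not taken from any of the datasets.
predicted = [
    ("Dylan", "was born in", "Minnesota"),
    ("Dylan", "was awarded", "Nobel Prize"),
]
gold = [
    ("dylan", "was born in", "minnesota"),
    ("Dylan", "is", "a songwriter"),
]
precision, recall, f1 = triple_f1(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case one of the two predicted triples matches one of the two gold triples, so precision, recall, and F1 all come out to 0.5.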

