BertNet: Harvesting Knowledge Graphs from Pretrained Language Models

06/28/2022
by   Shibo Hao, et al.

Symbolic knowledge graphs (KGs) have been constructed either through expensive human crowdsourcing or with complex, domain-specific information extraction pipelines. Emerging large pretrained language models (LMs), such as BERT, have been shown to implicitly encode massive knowledge that can be queried with properly designed prompts. However, compared to explicit KGs, the implicit knowledge in black-box LMs is often difficult to access or edit and lacks explainability. In this work, we aim to harvest symbolic KGs from LMs, and propose a new framework for automatic KG construction empowered by the neural LMs' flexibility and scalability. Compared to prior works that often rely on large human-annotated data or existing massive KGs, our approach requires only a minimal definition of relations as input, and hence is suitable for extracting knowledge of rich new relations not available before. The approach automatically generates diverse prompts and performs efficient knowledge search within a given LM for consistent and extensive outputs. The knowledge harvested with our approach is substantially more accurate than that of previous methods, as shown in both automatic and human evaluation. As a result, we derive from diverse LMs a family of new KGs (e.g., BertNet and RoBERTaNet) that contain a richer set of commonsense relations, including complex ones (e.g., "A is capable of but not good at B"), than human-annotated KGs (e.g., ConceptNet). In addition, the resulting KGs serve as a vehicle to interpret their respective source LMs, leading to new insights into the varying knowledge capabilities of different LMs.
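To make the idea of searching an LM with several prompts for consistent outputs more concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify the scoring function, so the weighted-average score, the `<A>`/`<B>` placeholder syntax, the prompt weights, and the names `fill`, `consistency_score`, `harvest`, and `toy_lm_score` are all assumptions made for illustration. The scorer is a toy lookup standing in for a real LM's log-likelihood.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (head entity, tail entity)


def fill(prompt: str, pair: Pair) -> str:
    """Substitute a candidate entity pair into a prompt template."""
    head, tail = pair
    return prompt.replace("<A>", head).replace("<B>", tail)


def consistency_score(
    pair: Pair,
    prompts: Dict[str, float],
    lm_score: Callable[[str], float],
) -> float:
    """Weighted average of the scorer's output over all prompts.

    A pair only scores well if it is plausible under every paraphrase
    of the relation, not just one (an assumed notion of consistency).
    """
    total_weight = sum(prompts.values())
    weighted = sum(w * lm_score(fill(p, pair)) for p, w in prompts.items())
    return weighted / total_weight


def harvest(
    candidates: List[Pair],
    prompts: Dict[str, float],
    lm_score: Callable[[str], float],
    top_k: int,
) -> List[Pair]:
    """Rank candidate pairs by consistency score and keep the best top_k."""
    ranked = sorted(
        candidates,
        key=lambda pair: consistency_score(pair, prompts, lm_score),
        reverse=True,
    )
    return ranked[:top_k]


# Toy stand-in for an LM: known sentences get log-probability 0.0,
# everything else gets -5.0. A real system would query a model here.
TOY_FACTS = {
    "a bird can fly",
    "a bird is able to fly",
    "a fish can swim",
    "a fish is able to swim",
}


def toy_lm_score(sentence: str) -> float:
    return 0.0 if sentence.lower() in TOY_FACTS else -5.0


# Hypothetical prompts for a "capable of" relation, with assumed weights.
prompts = {"A <A> can <B>": 1.0, "A <A> is able to <B>": 0.5}
candidates = [("bird", "fly"), ("fish", "fly"), ("fish", "swim"), ("bird", "swim")]

top = harvest(candidates, prompts, toy_lm_score, top_k=2)
print(top)
```

In this sketch, swapping `toy_lm_score` for a function that returns a real model's sentence score would leave the ranking logic unchanged, which is the main reason the scorer is passed in as a parameter.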


research
10/12/2020

COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

Recent years have brought about a renewed interest in commonsense repres...
research
10/13/2022

Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models

Pretrained language models (PLMs) for data-to-text (D2T) generation can ...
research
06/22/2021

Do Language Models Perform Generalizable Commonsense Inference?

Inspired by evidence that pretrained language models (LMs) encode common...
research
05/08/2023

Enhancing Knowledge Graph Construction Using Large Language Models

The growing trend of Large Language Models (LLM) development has attract...
research
05/03/2023

PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives

Sustaining coherent and engaging narratives requires dialogue or storyte...
research
08/25/2023

Rethinking Language Models as Symbolic Knowledge Graphs

Symbolic knowledge graphs (KGs) play a pivotal role in knowledge-centric...
research
11/01/2022

Evaluation Metrics for Symbolic Knowledge Extracted from Machine Learning Black Boxes: A Discussion Paper

As opaque decision systems are being increasingly adopted in almost any ...
